unicorn
Unicorn CPU emulator framework (ARM, AArch64, M68K, Mips, Sparc, PowerPC, RiscV, S390x, TriCore, X86)
Top Related Projects
Keystone assembler framework: Core (Arm, Arm64, Hexagon, Mips, PowerPC, Sparc, SystemZ & X86) + bindings
Official QEMU mirror. Please see https://www.qemu.org/contribute/ for how to submit changes to QEMU. Pull Requests are ignored. Please only use release tarballs from the QEMU website.
A powerful and user-friendly binary analysis platform!
UNIX-like reverse engineering framework and command-line toolset
Capstone disassembly/disassembler framework for ARM, ARM64 (ARMv8), Alpha, BPF, Ethereum VM, HPPA, LoongArch, M68K, M680X, Mips, MOS65XX, PPC, RISC-V(rv32G/rv64G), SH, Sparc, SystemZ, TMS320C64X, TriCore, Webassembly, XCore and X86.
RetDec is a retargetable machine-code decompiler based on LLVM.
Quick Overview
Unicorn is a lightweight, multi-platform, multi-architecture CPU emulator framework based on QEMU. It provides a clean and simple API for emulating various CPU architectures, including x86, ARM, MIPS, and more. Unicorn is designed for use in reverse engineering, malware analysis, and security research.
Pros
- Supports multiple CPU architectures and platforms
- Easy-to-use API with bindings for various programming languages
- High performance through just-in-time compilation, in a lightweight design
- Actively maintained and well-documented
Cons
- Limited to CPU emulation; it does not emulate a full system or peripherals
- May have a learning curve for users unfamiliar with low-level CPU operations
- Some advanced features may require in-depth knowledge of CPU architecture
Code Examples
- Basic x86 code emulation:
from unicorn import *
from unicorn.x86_const import *
# X86 code to emulate
X86_CODE32 = b"\x41\x4a" # INC ecx; DEC edx
# Initialize emulator in X86-32bit mode
mu = Uc(UC_ARCH_X86, UC_MODE_32)
# Map 2MB memory for this emulation
mu.mem_map(0x1000000, 2 * 1024 * 1024)
# Write machine code to be emulated to memory
mu.mem_write(0x1000000, X86_CODE32)
# Initialize registers
mu.reg_write(UC_X86_REG_ECX, 0x1234)
mu.reg_write(UC_X86_REG_EDX, 0x7890)
# Emulate code
mu.emu_start(0x1000000, 0x1000000 + len(X86_CODE32))
# Read results
print("ECX = 0x%x" % mu.reg_read(UC_X86_REG_ECX))
print("EDX = 0x%x" % mu.reg_read(UC_X86_REG_EDX))
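The register values this example should report can be worked out by hand. The sketch below models `INC` and `DEC` on 32-bit registers in plain Python, without Unicorn; the helper names `inc32` and `dec32` are illustrative.

```python
MASK32 = 0xFFFFFFFF  # 32-bit registers wrap around at 2**32

def inc32(value):
    """Model INC on a 32-bit register."""
    return (value + 1) & MASK32

def dec32(value):
    """Model DEC on a 32-bit register."""
    return (value - 1) & MASK32

ecx = inc32(0x1234)
edx = dec32(0x7890)
print("ECX = 0x%x" % ecx)  # ECX = 0x1235
print("EDX = 0x%x" % edx)  # EDX = 0x788f
```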
- ARM code emulation with hooks:
from unicorn import *
from unicorn.arm_const import *
def hook_code(uc, address, size, user_data):
    # Called before each instruction is executed
    print(">>> Tracing instruction at 0x%x, instruction size = 0x%x" % (address, size))
# ARM code to emulate
ARM_CODE = b"\x37\x00\xa0\xe3\x03\x10\x42\xe0" # mov r0, #0x37; sub r1, r2, r3
# Initialize emulator in ARM mode
mu = Uc(UC_ARCH_ARM, UC_MODE_ARM)
# Map 2MB memory for this emulation
mu.mem_map(0x10000, 2 * 1024 * 1024)
# Write machine code to be emulated to memory
mu.mem_write(0x10000, ARM_CODE)
# Add hook
mu.hook_add(UC_HOOK_CODE, hook_code)
# Emulate code
mu.emu_start(0x10000, 0x10000 + len(ARM_CODE))
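In ARM mode every instruction is 4 bytes wide, so the code hook is expected to fire once per 32-bit word. As a rough sketch that does not need Unicorn, the byte string can be split into little-endian words (the default byte order is assumed here) to see which addresses the trace should mention; `BASE` and `trace` are illustrative names.

```python
import struct

ARM_CODE = b"\x37\x00\xa0\xe3\x03\x10\x42\xe0"  # mov r0, #0x37; sub r1, r2, r3
BASE = 0x10000                                   # load address used in the example

# Decode the buffer as consecutive little-endian 32-bit instruction words.
words = struct.unpack("<%dI" % (len(ARM_CODE) // 4), ARM_CODE)

# One (address, size, encoding) entry per instruction.
trace = [(BASE + 4 * i, 4, word) for i, word in enumerate(words)]
for address, size, word in trace:
    print("0x%x: size = %d, encoding = 0x%08x" % (address, size, word))
```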
- Memory access tracking:
from unicorn import *
from unicorn.x86_const import *
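def hook_mem_access(uc, access, address, size, value, user_data):
    # Called on every hooked memory read or write
    if access == UC_MEM_WRITE:
        print(">>> Memory is being WRITTEN at 0x%x, data size = %u, data value = 0x%x"
              % (address, size, value))
    else:  # READ
        print(">>> Memory is being READ at 0x%x, data size = %u"
              % (address, size))
# X86 code to emulate
X86_CODE32 = b"\x89\x0D\x12\x00\x00\x00" # mov [0x12], ecx
# Initialize emulator in X86-32bit mode
mu = Uc(UC_ARCH_X86, UC_MODE_32)
# Remaining steps are a sketch (assumed, mirroring the earlier examples)
# Map 2MB for the code, plus one page at 0x0 so the write to 0x12 hits mapped memory
mu.mem_map(0x1000000, 2 * 1024 * 1024)
mu.mem_map(0x0, 0x1000)
mu.mem_write(0x1000000, X86_CODE32)
# Trace memory reads and writes, then emulate
mu.hook_add(UC_HOOK_MEM_READ | UC_HOOK_MEM_WRITE, hook_mem_access)
mu.emu_start(0x1000000, 0x1000000 + len(X86_CODE32))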
Competitor Comparisons
Keystone assembler framework: Core (Arm, Arm64, Hexagon, Mips, PowerPC, Sparc, SystemZ & X86) + bindings
Pros of Keystone
- Specialized in assembling, offering more focused functionality
- Covers some architectures that Unicorn does not, such as Hexagon
- The natural choice for pure assembly tasks, which Unicorn does not perform
Cons of Keystone
- Limited to assembling; it neither disassembles nor emulates code
- Smaller community and fewer resources compared to Unicorn
- Less frequent updates and maintenance
Code Comparison
Keystone (Assembly):
from keystone import *
ks = Ks(KS_ARCH_X86, KS_MODE_32)
encoding, count = ks.asm("add eax, ebx")
Unicorn (Emulation):
from unicorn import *
mu = Uc(UC_ARCH_X86, UC_MODE_32)
mu.mem_map(0x1000, 0x1000)
mu.mem_write(0x1000, b"\x01\xd8") # add eax, ebx
mu.emu_start(0x1000, 0x1002)
Keystone focuses on assembly, while Unicorn provides CPU emulation. Keystone suits tasks that only need to turn assembly source into machine code (disassembly is the job of Capstone, not Keystone), while Unicorn executes machine code, making it more versatile for complex reverse engineering and security research tasks.
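The two snippets meet at the bytes `\x01\xd8`: an assembler produces them and the emulator consumes them. As a hand-worked sketch of where those bytes come from, independent of either library, the encoding of `add eax, ebx` can be derived from the x86 ModRM layout; `encode_add_r32_r32` is a hypothetical helper covering only the register-to-register form.

```python
# Register numbers used in x86 ModRM encoding
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def encode_add_r32_r32(dst, src):
    """Encode `add dst, src` with opcode 0x01 (ADD r/m32, r32)."""
    # ModRM: mod = 0b11 (register direct), reg = source, rm = destination
    modrm = (0b11 << 6) | (REG32[src] << 3) | REG32[dst]
    return bytes([0x01, modrm])

code = encode_add_r32_r32("eax", "ebx")
print(code.hex())  # 01d8
```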
Official QEMU mirror. Please see https://www.qemu.org/contribute/ for how to submit changes to QEMU. Pull Requests are ignored. Please only use release tarballs from the QEMU website.
Pros of QEMU
- Full system emulation, supporting a wide range of architectures and devices
- More mature project with extensive documentation and community support
- Supports hardware-assisted virtualization for improved performance
Cons of QEMU
- Larger codebase and more complex to set up and use
- Higher resource consumption due to full system emulation
- More setup and startup overhead than Unicorn when only a small piece of code needs to be run
Code Comparison
QEMU (initializing an x86_64 machine; schematic excerpt of QEMU internals, which are not a stable public API):
QemuOpts *opts = qemu_opts_create(qemu_find_opts("machine"), NULL, 0, &error_abort);
qemu_opt_set(opts, "type", "q35", &error_abort);
current_machine = machine_parse(opts, current_machine_class);
Unicorn (initializing an x86_64 engine):
uc_engine *uc;
uc_err err = uc_open(UC_ARCH_X86, UC_MODE_64, &uc);
if (err != UC_ERR_OK) {
    printf("Failed on uc_open() with error: %u\n", err);
}
Both QEMU and Unicorn are powerful emulation tools, but they serve different purposes. QEMU is more suitable for full system emulation and virtualization, while Unicorn is designed for lightweight, flexible CPU emulation in various scenarios such as reverse engineering and malware analysis.
A powerful and user-friendly binary analysis platform!
Pros of angr
- More comprehensive analysis framework with symbolic execution capabilities
- Supports higher-level program analysis tasks like vulnerability discovery
- Includes a powerful constraint solver for complex path analysis
Cons of angr
- Steeper learning curve due to its complexity and broader feature set
- Generally slower execution compared to Unicorn's lightweight emulation
- Requires more system resources for full-scale analysis tasks
Code Comparison
angr example (simplified):
import angr
proj = angr.Project('binary')
state = proj.factory.entry_state()
simgr = proj.factory.simulation_manager(state)
simgr.explore(find=0x400000)
Unicorn example:
from unicorn import *
mu = Uc(UC_ARCH_X86, UC_MODE_64)
mu.mem_map(0x1000, 0x1000)
mu.mem_write(0x1000, b'\x90\x90\x90\x90')
mu.emu_start(0x1000, 0x1004)
angr provides a higher-level interface for program analysis, while Unicorn offers low-level CPU emulation. angr is better suited for complex analysis tasks, whereas Unicorn excels in lightweight, fast emulation scenarios. The choice between them depends on the specific requirements of the project and the depth of analysis needed.
UNIX-like reverse engineering framework and command-line toolset
Pros of Radare2
- More comprehensive reverse engineering toolkit with disassembly, debugging, and analysis capabilities
- Extensive scripting support with multiple languages (Python, JavaScript, etc.)
- Large and active community with frequent updates and contributions
Cons of Radare2
- Steeper learning curve due to its complex feature set and command-line interface
- Can be resource-intensive for large binaries or complex analysis tasks
- Less focused on pure emulation compared to Unicorn
Code Comparison
Radare2 (disassembly example):
r2 -qc 'pd 5' /bin/ls
0x00005850 31ed xor ebp, ebp
0x00005852 4989d1 mov r9, rdx
0x00005855 5e pop rsi
0x00005856 4889e2 mov rdx, rsp
0x00005859 4883e4f0 and rsp, 0xfffffffffffffff0
Unicorn (emulation example):
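from unicorn import *
# Assumed values for illustration
ADDRESS = 0x1000000
X86_CODE64 = b"\x90\x90" # nop; nop
mu = Uc(UC_ARCH_X86, UC_MODE_64)
mu.mem_map(ADDRESS, 2 * 1024 * 1024)
mu.mem_write(ADDRESS, X86_CODE64)
mu.emu_start(ADDRESS, ADDRESS + len(X86_CODE64))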
Radare2 is a broad reverse engineering toolkit covering disassembly, debugging, and analysis, while Unicorn is a lightweight and flexible CPU emulation engine for various architectures.
Capstone disassembly/disassembler framework for ARM, ARM64 (ARMv8), Alpha, BPF, Ethereum VM, HPPA, LoongArch, M68K, M680X, Mips, MOS65XX, PPC, RISC-V(rv32G/rv64G), SH, Sparc, SystemZ, TMS320C64X, TriCore, Webassembly, XCore and X86.
Pros of Capstone
- Lightweight and focused solely on disassembly, making it more efficient for specific tasks
- Supports a wider range of architectures, including less common ones
- Generally faster for pure disassembly operations
Cons of Capstone
- Limited to disassembly only, lacking emulation capabilities
- Requires additional tools or libraries for more complex analysis tasks
- Less suitable for dynamic analysis or runtime manipulation
Code Comparison
Capstone (disassembly):
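cs_insn *insn;
// handle comes from an earlier cs_open() call
size_t count = cs_disasm(handle, code, code_size, address, 0, &insn);
for (size_t j = 0; j < count; j++) {
    printf("0x%"PRIx64":\t%s\t\t%s\n", insn[j].address, insn[j].mnemonic, insn[j].op_str);
}
// Release the instruction array allocated by cs_disasm()
if (count > 0) cs_free(insn, count);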
Unicorn (emulation):
uc_emu_start(uc, address, address + code_size, 0, 0);
uint32_t r_eax;
uc_reg_read(uc, UC_X86_REG_EAX, &r_eax);
printf("EAX = 0x%x\n", r_eax);
Capstone excels at disassembly, while Unicorn provides emulation capabilities. Choose based on your specific needs: Capstone for static analysis and disassembly, Unicorn for dynamic analysis and emulation.
RetDec is a retargetable machine-code decompiler based on LLVM.
Pros of RetDec
- Comprehensive decompilation capabilities for multiple architectures
- Generates high-level C code from binary inputs
- Integrates various analysis tools for enhanced reverse engineering
Cons of RetDec
- Larger and more complex codebase, potentially harder to contribute to
- Slower execution compared to Unicorn's emulation approach
- More resource-intensive due to its comprehensive analysis features
Code Comparison
RetDec (C++ decompilation, schematic; the class and function names are illustrative rather than a verified RetDec API):
void decompile(const std::string& inputFile) {
    retdec::config::Config config;
    config.setInputFile(inputFile);
    auto decompiler = retdec::decompiler::createDecompiler(config);
    decompiler->run();
}
Unicorn (Python emulation):
from unicorn import *
def emulate(code, address):
    # address must be 4KB-aligned for mem_map
    mu = Uc(UC_ARCH_X86, UC_MODE_32)
    mu.mem_map(address, 1024 * 1024)
    mu.mem_write(address, code)
    mu.emu_start(address, address + len(code))
RetDec focuses on decompiling binaries to high-level code, while Unicorn provides lightweight CPU emulation. RetDec offers more comprehensive analysis but requires more resources, whereas Unicorn excels in speed and simplicity for specific emulation tasks.
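One practical detail behind an `emulate(code, address)` helper like the one above: Unicorn expects the address and size passed to `mem_map` to be aligned to 4KB. The following is a small sketch of rounding an arbitrary range out to page boundaries before mapping; `page_align` is a hypothetical helper, not part of the Unicorn API.

```python
PAGE_SIZE = 0x1000  # 4KB alignment required for mapped regions

def page_align(address, size):
    """Return (start, length) of the page-aligned range covering the input."""
    start = address & ~(PAGE_SIZE - 1)                              # round down
    end = (address + size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)       # round up
    return start, end - start

start, length = page_align(0x1234, 0x10)
print("map 0x%x bytes at 0x%x" % (length, start))  # map 0x1000 bytes at 0x1000
```

The aligned pair can then be passed to `mu.mem_map(start, length)`, while `mem_write` still uses the original, unaligned address.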
README
Unicorn Engine
Unicorn is a lightweight, multi-platform, multi-architecture CPU emulator framework, based on QEMU.
Unicorn offers some unparalleled features:
- Multi-architecture: ARM, ARM64 (ARMv8), M68K, MIPS, PowerPC, RISCV, SPARC, S390X, TriCore and X86 (16, 32, 64-bit)
- Clean/simple/lightweight/intuitive architecture-neutral API
- Implemented in pure C language, with bindings for Crystal, Clojure, Visual Basic, Perl, Rust, Ruby, Python, Java, .NET, Go, Delphi/Free Pascal, Haskell, Pharo, Lua and Zig.
- Native support for Windows & *nix (with Mac OSX, Linux, Android, *BSD & Solaris confirmed)
- High performance via Just-In-Time compilation
- Support for fine-grained instrumentation at various levels
- Thread-safety by design
- Distributed under free software license GPLv2
Further information is available at http://www.unicorn-engine.org
License
This project is released under the GPL license.
Compilation & Docs
See docs/COMPILE.md file for how to compile and install Unicorn.
More documentation is available in docs/README.md.
Contact
Contact us via mailing list, email or twitter for any questions.
Contribute
If you want to contribute, please pick up something from our GitHub issues.
We also maintain a list of more challenging problems in milestones for our regular releases.
Please send pull requests to our dev branch.
CREDITS.TXT records important contributors of our project.