unicorn

Unicorn CPU emulator framework (ARM, AArch64, M68K, Mips, Sparc, PowerPC, RiscV, S390x, TriCore, X86)

8,398

1,425


Top Related Projects

keystone

2,456

Keystone assembler framework: Core (Arm, Arm64, Hexagon, Mips, PowerPC, Sparc, SystemZ & X86) + bindings

qemu

11,780

Official QEMU mirror. Please see https://www.qemu.org/contribute/ for how to submit changes to QEMU. Pull Requests are ignored. Please only use release tarballs from the QEMU website.

angr

8,086

A powerful and user-friendly binary analysis platform!

radare2

22,027

UNIX-like reverse engineering framework and command-line toolset

capstone

8,186

Capstone disassembly/disassembler framework for ARM, ARM64 (ARMv8), Alpha, BPF, Ethereum VM, HPPA, LoongArch, M68K, M680X, Mips, MOS65XX, PPC, RISC-V(rv32G/rv64G), SH, Sparc, SystemZ, TMS320C64X, TriCore, Webassembly, XCore and X86.

retdec

8,316

RetDec is a retargetable machine-code decompiler based on LLVM.

Quick Overview

Unicorn is a lightweight, multi-platform, multi-architecture CPU emulator framework based on QEMU. It provides a clean and simple API for emulating various CPU architectures, including x86, ARM, MIPS, and more. Unicorn is designed for use in reverse engineering, malware analysis, and security research.

Pros

Supports multiple CPU architectures and platforms
Easy-to-use API with bindings for various programming languages
High performance and lightweight design
Actively maintained and well-documented

Cons

Limited to CPU emulation; it does not emulate a full system or peripherals
May have a learning curve for users unfamiliar with low-level CPU operations
Some advanced features may require in-depth knowledge of CPU architecture

Code Examples

Basic x86 code emulation:

from unicorn import *
from unicorn.x86_const import *

# X86 code to emulate
X86_CODE32 = b"\x41\x4a" # INC ecx; DEC edx

# Initialize emulator in X86-32bit mode
mu = Uc(UC_ARCH_X86, UC_MODE_32)

# Map 2MB memory for this emulation
mu.mem_map(0x1000000, 2 * 1024 * 1024)

# Write machine code to be emulated to memory
mu.mem_write(0x1000000, X86_CODE32)

# Initialize registers
mu.reg_write(UC_X86_REG_ECX, 0x1234)
mu.reg_write(UC_X86_REG_EDX, 0x7890)

# Emulate code
mu.emu_start(0x1000000, 0x1000000 + len(X86_CODE32))

# Read results
print("ECX = 0x%x" % mu.reg_read(UC_X86_REG_ECX))
print("EDX = 0x%x" % mu.reg_read(UC_X86_REG_EDX))
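
Unicorn maps memory in 4KB pages, so the address and size passed to `mem_map` need to be multiples of 0x1000. The following is a minimal sketch of a helper that rounds an arbitrary region out to page boundaries. The name `page_align` is hypothetical and not part of Unicorn's API, and the code only does the arithmetic without calling the emulator.

```python
PAGE_SIZE = 0x1000  # Unicorn maps memory in 4KB pages


def page_align(address, size, page_size=PAGE_SIZE):
    """Return (base, length) of the smallest page-aligned region
    covering [address, address + size). Hypothetical helper."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Round the start down and the end up to the nearest page boundary
    base = address & ~(page_size - 1)
    end = (address + size + page_size - 1) & ~(page_size - 1)
    return base, end - base


base, length = page_align(0x1000123, 2)
print(hex(base), hex(length))
```

The returned pair could then be passed as `mu.mem_map(base, length)` before writing code or data into the region.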

ARM code emulation with hooks:

from unicorn import *
from unicorn.arm_const import *

def hook_code(uc, address, size, user_data):
    print(">>> Tracing instruction at 0x%x, instruction size = 0x%x" %(address, size))

# ARM code to emulate
ARM_CODE = b"\x37\x00\xa0\xe3\x03\x10\x42\xe0" # mov r0, #0x37; sub r1, r2, r3

# Initialize emulator in ARM mode
mu = Uc(UC_ARCH_ARM, UC_MODE_ARM)

# Map 2MB memory for this emulation
mu.mem_map(0x10000, 2 * 1024 * 1024)

# Write machine code to be emulated to memory
mu.mem_write(0x10000, ARM_CODE)

# Add hook
mu.hook_add(UC_HOOK_CODE, hook_code)

# Emulate code
mu.emu_start(0x10000, 0x10000 + len(ARM_CODE))
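
In `UC_MODE_ARM` every instruction is a fixed-width 4-byte word, stored little-endian by default, which is why the hook above reports a size of 4 for each instruction. As a rough illustration using only the standard library (the helper name `arm_words` is made up for this sketch), the byte string can be split into the words the CPU will fetch:

```python
import struct

ARM_CODE = b"\x37\x00\xa0\xe3\x03\x10\x42\xe0"  # mov r0, #0x37; sub r1, r2, r3


def arm_words(code):
    """Split ARM-mode machine code into 32-bit little-endian words.
    Hypothetical helper, not part of Unicorn."""
    if len(code) % 4:
        raise ValueError("ARM-mode code must be a multiple of 4 bytes")
    return list(struct.unpack("<%dI" % (len(code) // 4), code))


for word in arm_words(ARM_CODE):
    print("0x%08x" % word)
```

The low byte of the first word, 0x37, is the immediate that `mov` places in r0.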

Memory access tracking:

from unicorn import *
from unicorn.x86_const import *

def hook_mem_access(uc, access, address, size, value, user_data):
    if access == UC_MEM_WRITE:
        print(">>> Memory is being WRITTEN at 0x%x, data size = %u, data value = 0x%x" \
                %(address, size, value))
    else:   # READ
        print(">>> Memory is being READ at 0x%x, data size = %u" \
                %(address, size))

# X86 code to emulate
X86_CODE32 = b"\x89\x0D\x12\x00\x00\x00" # mov [0x12], ecx

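# Initialize emulator in X86-32bit mode
mu = Uc(UC_ARCH_X86, UC_MODE_32)

# Map 2MB for the code, plus the page at address 0 that contains the target 0x12
mu.mem_map(0x1000000, 2 * 1024 * 1024)
mu.mem_map(0x0, 0x1000)

# Write machine code to be emulated to memory and set the value to store
mu.mem_write(0x1000000, X86_CODE32)
mu.reg_write(UC_X86_REG_ECX, 0x1234)

# Trace every memory read and write
mu.hook_add(UC_HOOK_MEM_READ | UC_HOOK_MEM_WRITE, hook_mem_access)

# Emulate code
mu.emu_start(0x1000000, 0x1000000 + len(X86_CODE32))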
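
The `size` and `value` arguments of `hook_mem_access` describe a write as an integer plus a byte count. On a little-endian target such as x86, those two numbers determine the bytes that end up in memory. The sketch below shows that mapping with a hypothetical helper, `stored_bytes`. It masks the value first, on the assumption that a binding may hand the value over as a signed integer.

```python
def stored_bytes(size, value):
    """Return the bytes a little-endian target places in memory for a
    write of `size` bytes carrying integer `value`. Hypothetical helper."""
    mask = (1 << (8 * size)) - 1  # keep only the low `size` bytes
    return (value & mask).to_bytes(size, "little")


# A 4-byte write of 0x1234 is laid out low byte first
print(stored_bytes(4, 0x1234).hex())
```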

Competitor Comparisons

keystone

2,456

Keystone assembler framework: Core (Arm, Arm64, Hexagon, Mips, PowerPC, Sparc, SystemZ & X86) + bindings

Pros of Keystone

Specialized in assembly, offering more focused functionality
Supports some architectures that Unicorn does not, such as Hexagon
Produces machine code directly from assembly text, which Unicorn cannot do

Cons of Keystone

Limited to assembly; it neither disassembles nor emulates code
Smaller community and fewer resources compared to Unicorn
Less frequent updates and maintenance

Code Comparison

Keystone (Assembly):

from keystone import *

ks = Ks(KS_ARCH_X86, KS_MODE_32)
encoding, count = ks.asm("add eax, ebx")

Unicorn (Emulation):

from unicorn import *

mu = Uc(UC_ARCH_X86, UC_MODE_32)
mu.mem_map(0x1000, 0x1000)
mu.mem_write(0x1000, b"\x01\xd8")  # add eax, ebx
mu.emu_start(0x1000, 0x1002)

Keystone focuses on assembly, while Unicorn provides CPU emulation. Keystone suits tasks that only need to turn assembly text into machine code, whereas Unicorn executes machine code, which makes the two complementary in reverse engineering and security research.
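
The two snippets above meet at the machine-code bytes: `b"\x01\xd8"` is the encoding an assembler produces for `add eax, ebx`. The hand-rolled sketch below covers only the register-to-register form of ADD, using a hypothetical helper name, to show where those bytes come from. A real assembler such as Keystone handles the full instruction set.

```python
# x86 32-bit register numbers as used in the ModRM byte
REG32 = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
         "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def encode_add_r32_r32(dst, src):
    """Encode `add dst, src` with opcode 0x01 (ADD r/m32, r32).
    Hypothetical helper for illustration only."""
    # mod=11 selects register-direct addressing; reg=src, rm=dst
    modrm = 0xC0 | (REG32[src] << 3) | REG32[dst]
    return bytes([0x01, modrm])


code = encode_add_r32_r32("eax", "ebx")
print(code.hex())
```

The resulting bytes are what `mu.mem_write` loads into emulated memory in the Unicorn snippet.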

qemu

11,780

Official QEMU mirror. Please see https://www.qemu.org/contribute/ for how to submit changes to QEMU. Pull Requests are ignored. Please only use release tarballs from the QEMU website.

Pros of QEMU

Full system emulation, supporting a wide range of architectures and devices
More mature project with extensive documentation and community support
Supports hardware-assisted virtualization for improved performance

Cons of QEMU

Larger codebase and more complex to set up and use
Higher resource consumption due to full system emulation
Slower execution speed for certain use cases compared to Unicorn

Code Comparison

QEMU (initializing an x86_64 machine):

QemuOpts *opts = qemu_opts_create(qemu_find_opts("machine"), NULL, 0, &error_abort);
qemu_opt_set(opts, "type", "q35", &error_abort);
current_machine = machine_parse(opts, current_machine_class);

Unicorn (initializing an x86_64 engine):

uc_engine *uc;
uc_err err = uc_open(UC_ARCH_X86, UC_MODE_64, &uc);
if (err != UC_ERR_OK) {
    printf("Failed on uc_open() with error: %u\n", err);
}

Both QEMU and Unicorn are powerful emulation tools, but they serve different purposes. QEMU is more suitable for full system emulation and virtualization, while Unicorn is designed for lightweight, flexible CPU emulation in various scenarios such as reverse engineering and malware analysis.

angr

8,086

A powerful and user-friendly binary analysis platform!

Pros of angr

More comprehensive analysis framework with symbolic execution capabilities
Supports higher-level program analysis tasks like vulnerability discovery
Includes a powerful constraint solver for complex path analysis

Cons of angr

Steeper learning curve due to its complexity and broader feature set
Generally slower execution compared to Unicorn's lightweight emulation
Requires more system resources for full-scale analysis tasks

Code Comparison

angr example (simplified):

import angr

proj = angr.Project('binary')
state = proj.factory.entry_state()
simgr = proj.factory.simulation_manager(state)
simgr.explore(find=0x400000)

Unicorn example:

from unicorn import *

mu = Uc(UC_ARCH_X86, UC_MODE_64)
mu.mem_map(0x1000, 0x1000)
mu.mem_write(0x1000, b'\x90\x90\x90\x90')
mu.emu_start(0x1000, 0x1004)

angr provides a higher-level interface for program analysis, while Unicorn offers low-level CPU emulation. angr is better suited for complex analysis tasks, whereas Unicorn excels in lightweight, fast emulation scenarios. The choice between them depends on the specific requirements of the project and the depth of analysis needed.

radare2

22,027

UNIX-like reverse engineering framework and command-line toolset

Pros of Radare2

More comprehensive reverse engineering toolkit with disassembly, debugging, and analysis capabilities
Extensive scripting support with multiple languages (Python, JavaScript, etc.)
Large and active community with frequent updates and contributions

Cons of Radare2

Steeper learning curve due to its complex feature set and command-line interface
Can be resource-intensive for large binaries or complex analysis tasks
Less focused on pure emulation compared to Unicorn

Code Comparison

Radare2 (disassembly example):

r2 -qc 'pd 5' /bin/ls
            0x00005850      31ed           xor ebp, ebp
            0x00005852      4989d1         mov r9, rdx
            0x00005855      5e             pop rsi
            0x00005856      4889e2         mov rdx, rsp
            0x00005859      4883e4f0       and rsp, 0xfffffffffffffff0

Unicorn (emulation example):

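from unicorn import *

ADDRESS = 0x1000000       # example load address (must be 4KB-aligned)
X86_CODE64 = b"\x90\x90"  # nop; nop (placeholder code)
mu = Uc(UC_ARCH_X86, UC_MODE_64)
mu.mem_map(ADDRESS, 2 * 1024 * 1024)
mu.mem_write(ADDRESS, X86_CODE64)
mu.emu_start(ADDRESS, ADDRESS + len(X86_CODE64))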

Radare2 is a broad reverse engineering toolkit covering disassembly, debugging, and analysis, while Unicorn is a lightweight, flexible emulation engine for multiple architectures.

capstone

8,186

Pros of Capstone

Lightweight and focused solely on disassembly, making it more efficient for specific tasks
Supports a wider range of architectures, including less common ones
Generally faster for pure disassembly operations

Cons of Capstone

Limited to disassembly only, lacking emulation capabilities
Requires additional tools or libraries for more complex analysis tasks
Less suitable for dynamic analysis or runtime manipulation

Code Comparison

Capstone (disassembly):

cs_insn *insn;
size_t count = cs_disasm(handle, code, code_size, address, 0, &insn);
for (size_t j = 0; j < count; j++) {
    printf("0x%"PRIx64":\t%s\t\t%s\n", insn[j].address, insn[j].mnemonic, insn[j].op_str);
}

Unicorn (emulation):

uc_emu_start(uc, address, address + code_size, 0, 0);
uint32_t r_eax;
uc_reg_read(uc, UC_X86_REG_EAX, &r_eax);
printf("EAX = 0x%x\n", r_eax);

Capstone excels at disassembly, while Unicorn provides emulation capabilities. Choose based on your specific needs: Capstone for static analysis and disassembly, Unicorn for dynamic analysis and emulation.

retdec

8,316

RetDec is a retargetable machine-code decompiler based on LLVM.

Pros of RetDec

Comprehensive decompilation capabilities for multiple architectures
Generates high-level C code from binary inputs
Integrates various analysis tools for enhanced reverse engineering

Cons of RetDec

Larger and more complex codebase, potentially harder to contribute to
Slower execution compared to Unicorn's emulation approach
More resource-intensive due to its comprehensive analysis features

Code Comparison

RetDec (C++ decompilation):

void decompile(const std::string& inputFile) {
    retdec::config::Config config;
    config.setInputFile(inputFile);
    auto decompiler = retdec::decompiler::createDecompiler(config);
    decompiler->run();
}

Unicorn (Python emulation):

from unicorn import *

def emulate(code, address):
    # address must be 4KB-aligned; map 1MB, load the code, and run it to the end
    mu = Uc(UC_ARCH_X86, UC_MODE_32)
    mu.mem_map(address, 1024 * 1024)
    mu.mem_write(address, code)
    mu.emu_start(address, address + len(code))

RetDec focuses on decompiling binaries to high-level code, while Unicorn provides lightweight CPU emulation. RetDec offers more comprehensive analysis but requires more resources, whereas Unicorn excels in speed and simplicity for specific emulation tasks.

README

Unicorn Engine

Unicorn is a lightweight, multi-platform, multi-architecture CPU emulator framework, based on QEMU.

Unicorn offers some unparalleled features:

Multi-architecture: ARM, ARM64 (ARMv8), M68K, MIPS, PowerPC, RISCV, SPARC, S390X, TriCore and X86 (16, 32, 64-bit)
Clean/simple/lightweight/intuitive architecture-neutral API
Implemented in pure C language, with bindings for Crystal, Clojure, Visual Basic, Perl, Rust, Ruby, Python, Java, .NET, Go, Delphi/Free Pascal, Haskell, Pharo, Lua and Zig.
Native support for Windows & *nix (with Mac OSX, Linux, Android, *BSD & Solaris confirmed)
High performance via Just-In-Time compilation
Support for fine-grained instrumentation at various levels
Thread-safety by design
Distributed under free software license GPLv2

Further information is available at http://www.unicorn-engine.org

License

This project is released under the GPLv2 license.

Compilation & Docs

See docs/COMPILE.md file for how to compile and install Unicorn.

More documentation is available in docs/README.md.

For common questions, read docs/FAQ.md before raising an issue.

Contact

Join our group for instant feedback.

Contribute

If you want to contribute, please pick up something from our GitHub issues.

We also maintain a list of more challenging problems in milestones for our regular releases.

Please send pull requests to our dev branch.

CREDITS.TXT records important contributors of our project.
