On static binary translation of ARM/Thumb Mixed ISA binaries

Jiunn Yeu Chen, Wuu Yang, Wei Chung Hsu, Bor Yeh Shen, Quan Huei Ou

Research output: Contribution to journal › Article

1 Scopus citation

Abstract

Code discovery has been a main challenge for static binary translation, especially when the source instruction set architecture has variable-length instructions, such as the x86 architectures. Because of data embedded in the code section, such as PC (program counter)-relative data, jump tables, or padding, a binary translator may be misled into translating data as instructions. For variable-length instructions, once a piece of data is mistranslated as instructions, the decoding of subsequent bytes can also go wrong. We are concerned with static binary translation for the very popular Advanced RISC Machine (ARM) architectures. Although ARM is considered a reduced instruction set computer architecture, it allows 32-bit (ARM) instructions and 16-bit (Thumb) instructions to be mixed in the same executable. In addition to having different instruction lengths, ARM and Thumb instructions are located at 4-byte and 2-byte aligned addresses, respectively. Furthermore, because ARM and Thumb instructions share the same encoding space, a 4-byte word could be decoded either as one ARM instruction or as two Thumb instructions. The correct decoding of this 4-byte word is determined at runtime by the least-significant bit of the program counter. For unstripped binaries, the mapping symbols can be used to identify ARM code regions and Thumb code regions. For stripped binaries, however, such mapping symbols are unavailable. We propose a novel solution to statically translate stripped ARM/Thumb mixed executables. Our solution is implemented in a static binary translator. For the code regions whose types cannot be determined with our solution, the binary translator generates multiple versions of translated code, and one of the versions is selected at runtime. The binary translator also includes a series of analyses that enable the removal of most useless code versions. Based on experimental results on stripped ARM/Thumb mixed binaries in the SPEC2006 and Embedded Microprocessor Benchmark Consortium (EEMBC) benchmark suites, our static binary translator achieves impressive performance when migrating them to run on x86 machines, and the space overhead is no more than 10%.
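
The decoding ambiguity described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article and does not reproduce the authors' translator: the function names `split_word` and `select_view` are hypothetical, a little-endian byte order is assumed, and the mode selection simply follows the abstract's statement that the least-significant bit of the address decides between ARM and Thumb.

```python
import struct


def split_word(word_bytes: bytes) -> dict:
    """Return both candidate views of one little-endian 4-byte word.

    Hypothetical helper: 'arm' is the word read as a single 32-bit value,
    'thumb' is the same bytes read as two consecutive 16-bit halfwords.
    """
    if len(word_bytes) != 4:
        raise ValueError("expected exactly 4 bytes")
    (arm,) = struct.unpack("<I", word_bytes)
    thumb = struct.unpack("<HH", word_bytes)
    return {"arm": arm, "thumb": thumb}


def select_view(target_address: int) -> tuple:
    """Pick the instruction set from the least-significant address bit.

    Assumption for illustration: bit 0 set means Thumb (the bit is then
    cleared to obtain the halfword-aligned address); bit 0 clear means ARM.
    """
    if target_address & 1:
        return ("thumb", target_address & ~1)
    return ("arm", target_address)


# The bytes 1E FF 2F E1 form the ARM encoding of `BX LR` (0xE12FFF1E);
# the same bytes can also be read as two 16-bit halfwords.
views = split_word(bytes.fromhex("1eff2fe1"))
print(hex(views["arm"]), [hex(h) for h in views["thumb"]])
print(select_view(0x8001), select_view(0x8000))
```

A static translator working on a stripped binary sees only the bytes, not the runtime address bit, which is why both views may have to be kept until an analysis rules one of them out.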

Original language: English
Article number: 81
Journal: ACM Transactions on Embedded Computing Systems
Volume: 16
Issue number: 3
State: Published - 1 Mar 2017

Keywords

  • Code discovery problem
  • Reverse engineering
  • Static binary translation
