Android Hacking: Part 1 – Decompilation & Source Code

In the following tutorial we will take a closer look at how to decompile an Android app and recover its source code. We will also see how decompiling can reveal secret information.

To have a practical example, I have written a demo app, which you can freely download here. Additionally, I have also uploaded the original source code of the app to GitHub. But as we progress, it should become clear that we do not need the actual source code.

Let’s start from the beginning.

Why decompile?


Getting the source code of an app – why should we want that anyway? This is best illustrated using our demo app for this tutorial:

The app is rather simple and basically just a small vault for your imaginary money. The safe requires the correct password combination to open its door. Of course, we could start by randomly guessing the combination, but if the password is more complex, we won’t get anywhere for quite some time.

However, in order to verify whether the entered password is correct or incorrect, the application must somehow know what the actual password is. Hence, if we manage to inspect the source code of the application, we might find clues to the actual password of the vault.

Although I wrote the app especially for this tutorial, the context is not that far away from reality. Even in the applications of large companies, the source code often contains equally sensitive information – whether it is the admin’s password, API tokens or even the credentials to the database. In addition, the source code allows us to understand the functionality of the app much more precisely and thus find vulnerabilities more easily.

What actually is an Android app?


Android applications are distributed as APK files, usually through Google’s Play Store. APK is short for Android Package, and the file is essentially a ZIP archive.

This means that if we unpack our demo application with any ZIP tool, we can, in theory, view the entire contents of the app:

unzip -d vault-unzip vault.apk
Content of the un-zipped APK file
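
Because an APK is a ZIP archive, the same listing can also be produced programmatically. The following is a minimal sketch that uses only the standard java.util.zip API; the class name ApkLister and its default path vault.apk are illustrative assumptions, not part of the demo app.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper: lists the entries of an APK by treating it as a ZIP archive.
public class ApkLister {

    // Returns the names of all archive entries, sorted alphabetically.
    public static List<String> listEntries(Path apk) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipFile zip = new ZipFile(apk.toFile())) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the APK lies in the working directory unless a path is given.
        Path apk = Paths.get(args.length > 0 ? args[0] : "vault.apk");
        for (String name : listEntries(apk)) {
            System.out.println(name);
        }
    }
}
```

Run against the demo app, this should print entries such as AndroidManifest.xml and classes.dex, which are discussed next.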

Unfortunately, in reality it is a little more complicated than that.

For example, if we take a look at the file AndroidManifest.xml, we see only unreadable binary data. The app is unpacked, but its contents are still compiled.

Compiled content of the AndroidManifest.xml
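
To check whether a manifest is still in its compiled form without opening it in an editor, we can look at its first bytes. The sketch below assumes that a compiled manifest starts with a chunk header of type 0x0003 and header size 0x0008 in little-endian order, as used by Android's binary XML format. The class name ManifestSniffer is hypothetical, and the default path follows the unzip command above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: guesses whether an XML file is compiled or plain text.
public class ManifestSniffer {

    // Assumption: compiled Android XML begins with the bytes 03 00 08 00
    // (chunk type 0x0003, header size 0x0008, both little-endian).
    public static boolean looksLikeBinaryXml(byte[] data) {
        return data.length >= 4
                && data[0] == 0x03 && data[1] == 0x00
                && data[2] == 0x08 && data[3] == 0x00;
    }

    // Plain-text XML starts with '<' once leading whitespace is ignored.
    public static boolean looksLikeTextXml(byte[] data) {
        int n = Math.min(data.length, 64);
        String start = new String(data, 0, n, StandardCharsets.UTF_8).trim();
        return start.startsWith("<");
    }

    public static void main(String[] args) throws IOException {
        String path = args.length > 0 ? args[0] : "vault-unzip/AndroidManifest.xml";
        byte[] data = Files.readAllBytes(Paths.get(path));
        if (looksLikeBinaryXml(data)) {
            System.out.println("compiled (binary) XML");
        } else if (looksLikeTextXml(data)) {
            System.out.println("plain-text XML");
        } else {
            System.out.println("unknown format");
        }
    }
}
```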

The classes.dex file contains the application’s actual code, currently also as compiled bytecode. DEX stands for Dalvik Executable. Originally, the Android operating system started a Dalvik virtual machine for each app and executed the code of the DEX file inside it. Since Android 5.0, the Android Runtime (ART) has taken over this role, but apps still ship their code in the DEX format.

If we manage to view the contents of the classes.dex file, we have a good chance of finding the password.
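
A quick way to confirm that a file really is a DEX file is to inspect its header. A DEX file begins with the magic bytes "dex", a newline, three ASCII digits for the format version, and a zero byte. The following minimal sketch reads that version; the class name DexHeader and the default path are illustrative assumptions.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper: extracts the format version from a DEX header.
public class DexHeader {

    // Returns the three-digit version (for example "035") if the data starts
    // with the DEX magic "dex\n" + version + "\0", otherwise an empty Optional.
    public static Optional<String> dexVersion(byte[] data) {
        if (data.length < 8) {
            return Optional.empty();
        }
        boolean magicOk = data[0] == 'd' && data[1] == 'e' && data[2] == 'x'
                && data[3] == '\n' && data[7] == 0;
        if (!magicOk) {
            return Optional.empty();
        }
        return Optional.of(new String(data, 4, 3, StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the APK was unpacked into vault-unzip as shown earlier.
        String path = args.length > 0 ? args[0] : "vault-unzip/classes.dex";
        byte[] data = Files.readAllBytes(Paths.get(path));
        Optional<String> version = dexVersion(data);
        System.out.println(version.isPresent()
                ? "DEX file, format version " + version.get()
                : "not a DEX file");
    }
}
```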

Disassembling vs. Decompiling


It is now necessary to distinguish between disassembling on the one hand and decompiling on the other.

Disassembling

When you disassemble bytecode, it is converted into a low-level, machine-oriented language that is still readable for humans. With Android apps, the DEX file is converted into so-called Smali code.

We can then analyze the Smali code and manipulate it as we like. Afterwards, we can rebuild the app and run it on our smartphones again. This allows us, for example, to unlock restricted functionalities in an app or to have infinite lives in a game. We will learn how to do this in the second part of the Android Hacking course.

Decompiling

Decompiling, unlike disassembling, converts the bytecode into a high-level programming language. The goal is to reverse the compilation process and restore the source code as close to the original as possible. The quality of the reconstructed code depends heavily on the decompiler used.

Android applications can basically be decompiled in two different ways. The most common approach is to decompile the app in two separate steps. Why this method, despite its popularity, has various disadvantages will become more obvious later on.

Method 1: Dex → Jar → Java


In the first method, the Dalvik bytecode of the DEX file is first converted to Java bytecode. As a result, we get a compiled Java archive in the form of a JAR file. We can then use any Java decompiler to convert the JAR archive back into readable Java files. This method is the most popular because Java has been around for a long time and many good decompilers already exist.

In order to convert the DEX file to a JAR file, we will use dex2jar. This open source tool is available for free for all platforms.

d2j-dex2jar vault.apk

Next we will use JD-GUI to open the JAR archive. The open source Java decompiler is also available for all platforms.

Now we see that the JAR file contains a total of 4 root packages. The first three packages are not that big of a deal, since they basically contain utility libraries we don’t care about.

However, in the package digital.basto.vault we will find the exciting classes.

Here we have the BuildConfig class with some meta information about the application, the UI package with the classes controlling the user interface, and the data package, which defines the data structure.

After a brief investigation, we discover the class VaultDataSource. It contains the field vaultCombination with the string value Subscr1be!. Doesn’t this look like a promising string to test in the application?

In the app, we now enter Subscr1be! as the password, confirm with a click on “Unlock” and…

…it works! By decompiling and analyzing the source code, we were able to crack the vault and can now access the immense fortune of 1337 euros and 42 cents!

A little tip for more complex applications:
In JD-GUI, it is also possible to decompile all classes at once and export the entire source code via “File → Save all Sources”. You can then open the exported source code in any editor. This makes it easier to browse the files or add comments. In addition, you can give the individual variables and methods more meaningful names. The latter is particularly useful for making code in foreign languages or obfuscated code easier to understand.
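
Once the sources are exported, the search for interesting strings can be partly automated. The following is a rough sketch, not a complete secret scanner: it walks a directory of exported .java files and reports lines where a string literal is assigned to a name containing a suspicious keyword. The class name SecretScanner, the keyword list and the default directory name are all assumptions chosen for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: flags string literals assigned to suspiciously named fields.
public class SecretScanner {

    // Assumption: these keywords are only a starting point; extend as needed.
    private static final Pattern SUSPICIOUS = Pattern.compile(
            "(?i)(password|passwd|secret|token|api_?key|combination)\\w*\\s*=\\s*\"[^\"]*\"");

    // Returns one "file:line: source" entry per matching line below root.
    // Files are read as UTF-8; other encodings would need extra handling.
    public static List<String> scan(Path root) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".java"))
                    .sorted()
                    .collect(Collectors.toList());
        }
        List<String> hits = new ArrayList<>();
        for (Path file : files) {
            List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
            for (int i = 0; i < lines.size(); i++) {
                if (SUSPICIOUS.matcher(lines.get(i)).find()) {
                    hits.add(root.relativize(file) + ":" + (i + 1) + ": " + lines.get(i).trim());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the exported sources were extracted into this directory.
        Path root = Paths.get(args.length > 0 ? args[0] : "vault-sources");
        scan(root).forEach(System.out::println);
    }
}
```

For the demo app, such a scan should flag the vaultCombination field in VaultDataSource. In real applications, expect false positives and manual follow-up.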

Now let’s take a look at how to decompile DEX files or entire APKs directly into Java code without any additional steps.

Method 2: APK → Java


In the more modern approach, we convert the APK file directly into the corresponding Java files. This method is less complicated, and because it skips the intermediate JAR conversion, much less meta information is lost along the way, which usually leads to better results. Unfortunately, there are relatively few Android decompilers available.

To decompile the app from binary code directly into Java classes, we use the Android decompiler JADX. With JADX, we can simply open the APK file and view the source code.

This shows the most significant benefit of an Android decompiler compared to a Java decompiler. In addition to the source code from the DEX file, we also get the decompiled AndroidManifest, information about the certificate and all kinds of other meta-information that might help us with our analyses.

Head-to-head comparison: JD-GUI vs. JADX


A direct comparison between the source code generated by JADX and by JD-GUI shows that JADX not only has the better syntax highlighting, but also produces significantly better results.

Original Code

Below we first see the original source code of the file, as I have also published it on GitHub. The string parameter password is passed to the unlock method, which then verifies it in an if-else statement.

package digital.basto.vault.data;
import digital.basto.vault.data.model.VaultData;
import java.io.IOException;
import java.security.AccessControlException;

public class VaultDataSource {
    private String vaultCombination = "Subscr1be!";

    public Result<VaultData> unlock(String password) {
        try {
            if (vaultCombination.equals(password)) {
                VaultData unlockData = new VaultData(1337.42f);
                return new Result.Success<>(unlockData);
            } else {
                return new Result.Error(new AccessControlException("Wrong password!"));
            }
        } catch (Exception e) {
            return new Result.Error(new IOException("Error unlocking view!", e));
        }
    }

    public void lock() {
    }
}

Results from JD-GUI

If we now take a look at the results from JD-GUI, we can see that the logic is essentially the same, but the code is harder to analyze. First, the methods appear in reverse order: the lock method is listed before the unlock method. In addition, the passed parameter is named paramString instead of password. This is particularly confusing, as the same name is used again for the exception in the catch block. To make matters a bit more complex, JD-GUI uses the conditional (ternary) operator <condition> ? <True> : <False> instead of the classic if-else statement.

package digital.basto.vault.data;
import digital.basto.vault.data.model.VaultData;
import java.io.IOException;
import java.security.AccessControlException;

public class VaultDataSource {
  private String vaultCombination = "Subscr1be!";
  
  public void lock() {}
  
  public Result<VaultData> unlock(String paramString) {
    try {
      return this.vaultCombination.equals(paramString) 
          ? new Result.Success(new VaultData(1337.42F)) 
          : new Result.Error(new AccessControlException("Wrong password!"));
    } catch (Exception paramString) {
      return new Result.Error(new IOException("Error unlocking view!", paramString));
    } 
  }
}
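
The conditional expression in the JD-GUI listing above behaves exactly like the if-else statement of the original. To make the equivalence concrete, here is a small, self-contained sketch. The class TernaryDemo and its methods are hypothetical stand-ins that return plain strings instead of the Result objects used by the demo app.

```java
// Hypothetical stand-in for the vault check, reduced to plain strings.
public class TernaryDemo {

    // Classic if-else form, as in the original source code.
    public static String unlockWithIfElse(String stored, String entered) {
        if (stored.equals(entered)) {
            return "Success";
        } else {
            return "Wrong password!";
        }
    }

    // Conditional (ternary) form, as emitted by JD-GUI.
    public static String unlockWithTernary(String stored, String entered) {
        return stored.equals(entered) ? "Success" : "Wrong password!";
    }

    public static void main(String[] args) {
        String stored = "Subscr1be!";
        for (String attempt : new String[] {"Subscr1be!", "guess"}) {
            System.out.println(attempt + " -> " + unlockWithIfElse(stored, attempt)
                    + " / " + unlockWithTernary(stored, attempt));
        }
    }
}
```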

Results from JADX

JADX, on the other hand, has done an excellent job. The order of the methods is the same as in the original and all parameters are named correctly. The only difference is that the if-else statement was reduced to a simple if statement followed by a return. However, this does not hurt readability, and some would even prefer this notation.

package digital.basto.vault.data;
import digital.basto.vault.data.Result.Error;
import digital.basto.vault.data.Result.Success;
import digital.basto.vault.data.model.VaultData;
import java.io.IOException;
import java.security.AccessControlException;

public class VaultDataSource {
    private String vaultCombination = "Subscr1be!";

    public Result<VaultData> unlock(String password) {
        try {
            if (this.vaultCombination.equals(password)) {
                return new Success(new VaultData(1337.42f));
            }
            return new Error(new AccessControlException("Wrong password!"));
        } catch (Exception e) {
            return new Error(new IOException("Error unlocking view!", e));
        }
    }

    public void lock() {
    }
}

Conclusion


We learned the first and one of the most important steps in the analysis of Android apps: effectively decompiling and restoring the source code of an application.

Choosing the right tools is critical when searching for vulnerabilities. Although there are far fewer Android decompilers than Java decompilers, JADX is a great and free option. Apart from additional meta-information, JADX also offers the ability to de-obfuscate the code and usually produces better results. Comparing the two methods yourself is definitely worthwhile.

Continue here with Part 2: Manipulating Apps