PlaidCTF 2015 “RE GEX” Write-up

This challenge came with a simple Python TCP server and a monstrous regex in a separate file. It consists of multiple sub-patterns (2563 to be exact), separated by OR operators1. To get the flag, the input supplied must NOT match any of these sub-patterns. If you break up the large regex expression, you see something like this:

^(.*[^plaidctf].*|
.{,170}|
.{172,}|
.{88}[padt].{60}[licf].{6}[plai].{14}|
.{60}[aitf].{17}[pldc].{85}[dctf].{6}|
    ...
.{54}[pldc].{88}[plai][aitf].{26}|
.{11}[aitf].{41}[aitf].{97}[padt].{19})$

The 3 conditions listed up front give you a hint to the requirements: must be made up of the letters plaidctf and be of length 171. Any patterns that DO match will NOT reveal the flag.

I tried the dumb way, which is to brute force the sequence. Python’s itertools.permutations killed the process, so I thought to represent the string using a large number instead, composed of 3 bits (to represent one of plaidctf) for each “character”. Python supports arbitrary sized integers, so manipulating and counting integers up to 171 * 3 bits wasn’t an issue. Well, of course that process took forever…

Z3 to the rescue!

The sane (and probably only) way to solve it was to use a theorem prover like Z3, so I decided to invest some time to play around with it, even though it was too late to submit the flag.

The regex patterns were probably emitted by some program itself, so it followed some regular syntax, which meant that I didn’t have to resort to using a regex parser. There were only 3 types of expressions used: a single character (.), n-characters (.{n}), and character classes ([xyz]). The patterns were parsed into Z3 constraints, in which we only had to restrict the character classes imposed by the each pattern at particular positions. A regex pattern like .{88}[padt].{60}[licf].{6}[plai].{14} is turned into a constraint expression like this:

    Or(And(pattern >> 246 & 7 != 0,
           pattern >> 246 & 7 != 2,
           pattern >> 246 & 7 != 4,
           pattern >> 246 & 7 != 6),
       And(pattern >> 63 & 7 != 1,
           pattern >> 63 & 7 != 3,
           pattern >> 63 & 7 != 5,
           pattern >> 63 & 7 != 7),
       And(pattern >> 42 & 7 != 0,
           pattern >> 42 & 7 != 1,
           pattern >> 42 & 7 != 2,
           pattern >> 42 & 7 != 3))

Solving this contraint yields the correct solution for this particular regex pattern. By combining all constraints with an And operator, the model will provide the correct solution for this challenge. You can see that I retained the representation of the pattern as a BitVector of length 171 * 3.

You can find my code here, which expects the regex patterns to be pre-processed and provided as input. The model took 11min 50sec to solve on my machine. With that, the server returns flag{np_hard_reg3x_ftw!!!1_ftdtatililactldtadf}.

Seeing as how Z3 is getting more popular in CTFs, I hope to use this new found ability for future challenges.

If you’d like to see what else Z3 can do, check out some of the inspiring entries on Diary of a reverse-engineer.


  1. Well it’s not really called that, but you get the point 

Paper Shredder Repair

Now is probably a good time to mention that I have a paper shredder. When I was shopping for a shredder, the basic requirement is that it must be relatively “secure”. Straight cut shredders (that produce long straight strips) are definitely not secure.

Ultimately I settled on the CARL DS-3000 personal paper shredder. The DS-3000 is a cross-cut shredder which produces “particles” no larger than 2mm x 4.5mm and this meets DIN security level 4. These days, the NSA mandates 1mm x 5mm “particles” for classified documents.

At this point, it’s probably helpful to show you what my shredder bin looks like:

shredder confetti

From the particles, you can make out various truncated words such as “A/C”, “exp” and the number “5”, but it’s almost impossible to reconstruct any bank balances or personal information from it.

This particular model was the right balance between my budget and the level of security. Plus, the shredder is compact enough to sit on your desk. I bought it in 2009 and I use it every couple of months when I have accumulated enough material that needs to be destroyed.

I was in the middle of shredding papers when it suddenly stopped working. Now the shredder does not respond when I stick paper into its slot. The LED indicator looks dimmer than usual when it is turned on.

But I’m not ready to give up on it just yet…

Continue reading

Interesting 31C3 Talks

31C3 logo "a new dawn"

The 31st Chaos Communication Congress (31C3) ended just 3 days ago, and there were several interesting talks.

They have got live streaming of the event over the web, as well as encourage you to use an external player with RTMP or HLS support. The video streams were very reliable and best of all, it’s available in HD. In comparison, I tried the Apple live event once and it was really crappy. For one, the HLS1 URL is not publicly available , so someone had to dig that out and post it. Even after that, the audio stream was (I believe, unintentionally) a mix of both English and Chinese simultaneously.

The 31C3 video recordings were also uploaded very quickly after the event. This is much quicker than other events such as Black Hat (although as an attendee, you do get a copy of the stuff on a DVD). A really big kudos to the organizers and the video production team!

If you don’t have time to listen to each and every talk, here are a few selected talks that were interesting to me, as well as a short summary to see if it’s worth 30 or 60 minutes of your time.

A full list of talks can be found here: http://media.ccc.de/browse/congress/2014/index.html

Continue reading

Visualizing Binary Features with matplotlib

Some time ago, I started playing around with data analysis and machine learning. One of the more popular tools for such tasks is IPython Notebook, a browser-based interactive REPL shell based on IPython. Each session becomes a “notebook” that records the entire REPL session with both inputs and (cached) outputs, which can be saved and reviewed at a later time, or exported into another format like HTML. This capability, combined with matplotlib for plotting and pandas for slicing and dicing data makes this a handy tool for analyzing and visualizing data. To give you an idea of how useful this tool can be, take a look at some example notebooks using the online notebook viewer.

In this quick post, I’ll describe how I visualize binary features (present/not present) and clustering of such data. I am assuming that you already have experience with all of the above-mentioned libraries. For this example, I’ve extracted permissions (uses-permission) and features (uses-feature) used by a set of Android apps using Androguard. The resulting visualization looks like this:

visualization of binary features

Each row represents one app and each column represents one feature. More specifically, each column represent whether a permission or feature is used by the app. Such a visualization makes it easy to see patterns, such as which permission or feature is more frequently used by apps (shown as downward streaks), or whether an app uses more or less features compared to other apps (which shows up as horizontal streaks).

While this may look relatively trivial, when the number of samples increase to thousands of apps, it becomes difficult to make sense of all the rows & columns in the data table by staring at it.

Continue reading

Android Internals: Package Verifiers

Inspired by Nikolay Elenkov’s detailed technical posts on Android Explorations, I decided to dig into the Android source code myself and document the package verification mechanism in Android.

Package verification was introduced in Android 4.2 to allow for apps to be verified or checked before they are installed. If you have tried to install a malicious app on a production Android device, you might have seen the following screen, displayed by the verifier:

screenshot of malicious app install warning

Android was built in such a way that it tries to be generic for third-parties to implement stuff. Package verification is a feature that is currently only used and implemented by Google, but it is abstracted in such a way that any manufacturer can implement their own. Documentation and examples on how to do this is almost non-existent, although anyone determined enough can read the Android source code and figure it out for themselves.

Continue reading