Reverse Engineering: [Rootkit] TDL3 – “Why so serious? Let’s put a smile..”

11/17/09

I. Introduction

TDL, or TDSS, is a trojan family well known for its effectiveness and active technical development. It contains two components: a kernel-mode rootkit and several user-mode DLLs which perform the trojan’s operations (downloading, blocking AVs, etc.). Since the rootkit acts as an “injector” and protector for the ring3 bot binaries, almost all of the technical evolution of this threat family focuses on rootkit technology so as to evade AV scanners.

As its name suggests, TDL3 is the 3rd generation of the TDL rootkit, and it still aims at concealing the existence of malicious code. Besides the known features, this threat comes with a couple of impressive tricks which help it bypass personal firewalls and stay totally undetected by all AVs and ARKs (anti-rootkit tools) at the moment. These aspects and techniques are discussed in more detail in the sections that follow.

II. The Dropper

II.1 The packer

The dropper (0a374623f102930d3f1b6615cd3ef0f3) comes packed and obfuscated, as usual, by a packer similar to the one other TDL/TDSS variants used in the past. Despite the author’s attempt to bypass PE-file heuristic scanning by inserting several random API imports and exports, the sample still gets detected by various heuristics-based scanners.

II.2 The installation mechanism

There’s nothing interesting about the dropper except its unique approach to installing itself into a system. Instead of using a known or documented method, this sample implements a “0day” technique to execute itself, so it can easily bypass some weak HIPS/personal firewalls.

Figure 1 illustrates a pseudo-code snippet of one part of the dropper.

Figure 1. Pseudo code of TDL3’s bypassing personal firewall method

First, the dropper copies itself into the Print Processor directory under a random name determined by the system, then it modifies the characteristics of the newly created file to convert it into a PE Dynamic-Link Library (DLL).
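From an analyst’s point of view, the “characteristics” in question live in the COFF file header of the PE image, where the IMAGE_FILE_DLL bit (0x2000) marks a file as a DLL. The following is a minimal sketch of how one might check that bit on a suspicious file; the helper names and the fake header used for the demonstration are my own assumptions, not something taken from the dropper.

```python
import struct

IMAGE_FILE_DLL = 0x2000  # COFF Characteristics bit that marks a DLL image


def characteristics_offset(pe_bytes: bytes) -> int:
    """Return the file offset of the COFF Characteristics field."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # e_lfanew (at offset 0x3C) points at the 'PE\0\0' signature.
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Skip the 4-byte signature; Characteristics is the last 2 bytes
    # of the 20-byte COFF header, i.e. 18 bytes into it.
    return e_lfanew + 4 + 18


def is_dll(pe_bytes: bytes) -> bool:
    """True if the image is flagged as a DLL."""
    (flags,) = struct.unpack_from("<H", pe_bytes, characteristics_offset(pe_bytes))
    return bool(flags & IMAGE_FILE_DLL)


def build_minimal_header(flags: int) -> bytes:
    """Hypothetical helper: a fake header, just enough for the parser above."""
    buf = bytearray(0x80)
    buf[0:2] = b"MZ"
    struct.pack_into("<I", buf, 0x3C, 0x40)
    buf[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<H", buf, 0x40 + 4 + 18, flags)
    return bytes(buf)


if __name__ == "__main__":
    exe_like = build_minimal_header(0x0102)  # executable image, no DLL bit
    dll_like = build_minimal_header(0x2102)  # same flags plus IMAGE_FILE_DLL
    print(is_dll(exe_like), is_dll(dll_like))
```

A file that was recently an EXE and now carries the DLL bit inside the Print Processor directory would be a reasonable thing to flag during triage.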

And here comes the interesting part of the dropper. After changing the characteristics, the dropper registers the malicious DLL file as a Print Processor named “tdl” by calling the winspool API AddPrintProcessorA(). Internally, this API issues an RPC call to the Printing Subsystem hosted by the spoolsv.exe process and forces spoolsv.exe to load the Print Processor DLL remotely. In this case, spoolsv.exe executes the DLL version of the dropper, copied into the Print Processor directory, inside the context of the spoolsv.exe process. Since spoolsv.exe is usually treated as a system-trusted process by almost all personal firewalls, the malicious DLL has permission to do anything to the system without any notification or alarm being shown to the user.

Although this is a pretty cool method to remotely load and execute a malicious DLL inside another trusted process, it has some limitations too. First, the caller must have SeLoadDriverPrivilege, and second, it has to be able to write files to the Print Processor directory. Moreover, when an application tries to acquire SeLoadDriverPrivilege, some personal firewalls will notify the user about that suspicious behaviour. Anyway, since most users aren’t technically aware and always log in with Administrator privileges, I guess the successful installation rate isn’t seriously affected by these obstacles.

Figure 2. TDL3 user-mode dropper: Bypassing personal firewall mechanism

Back to the dropper, after being loaded into spoolsv.exe, the malicious DLL drops a driver and begins its second stage infection in kernel space by calling NtLoadDriver() directly.

II.3 The first kernel mode dropper stage: Unpacking

Now the battle moves to kernel mode. The dropped driver loaded by spoolsv.exe is actually a loader for another piece of embedded kernel code. From its DriverEntry(), the driver allocates kernel pool memory, copies the compressed data into it and employs aPLib to unpack the real rootkit driver contained inside itself.

One thing worth mentioning: the author employed a small anti-static-analysis trick during this unpacking process. He first hooks an imported API in the IAT of the current driver with the unpacking routine, then calls that API. Because the API’s address in the IAT has already been modified, execution is transferred to the real decompression procedure. An analyst relying on static analysis (e.g. an IDA disassembly) could miss the unpacking routine.

In the sample I analyzed, the hooked API is RtlAppendAsciizToString.

Figure 3. TDL3 kernel mode dropper anti-static analysis: IAT self hooking
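To make the idea concrete, here is a toy model of IAT self-hooking. It is only a sketch: the IAT is modelled as a plain Python dictionary, and both functions are stand-ins I made up, so nothing here reflects the driver’s real code.

```python
def rtl_append_asciiz_to_string(*args):
    """Stand-in for the genuine import; it is never reached after the hook."""
    return "original import"


def unpack_routine(*args):
    """Stand-in for the hidden decompression routine."""
    return "unpacker ran"


# The import address table, modelled as a name -> callable mapping.
iat = {"RtlAppendAsciizToString": rtl_append_asciiz_to_string}


def driver_entry():
    # Step 1: overwrite the IAT slot with the address of the unpacker.
    iat["RtlAppendAsciizToString"] = unpack_routine
    # Step 2: "call the import". A disassembler only sees a call to
    # RtlAppendAsciizToString at this site, not the real target.
    return iat["RtlAppendAsciizToString"]()


if __name__ == "__main__":
    print(driver_entry())
```

The call site looks like an ordinary import call, which is exactly why a purely static reading of the listing can walk past the unpacker.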

At the end of this stage, the loader maps the unpacked driver’s PE image into NonPagedPool memory and finally jumps to that new region, beginning the second stage of the kernel mode infection.

II.4 The second kernel mode dropper stage: Infecting & storing rootkit’s code

The real deal lies in the “freshly baked” code. It does various things to make the rootkit survive a reboot, but the most important and interesting parts are:

o Infecting miniport driver

o Survive-reboot strategy

o Direct read/write to hard disk using SCSI class request

II.4.1 Infecting driver

The infector first queries the device object responsible for partition0 on the hard disk device on which “\systemroot” is installed. Via that device object, it is convenient for the rootkit to retrieve the last miniport driver object and the name of the driver’s binary file. For example, in my analysis, the name of the driver is “atapi”, and “\systemroot\system32\drivers\atapi.sys” is the file that gets infected.

The infection algorithm isn’t complicated. It overwrites the data of the “.rsrc” section of the victim driver with 824 bytes instead of taking over the whole driver like others did (e.g. Rustock.C), so the size of the infected file is the same before and after the infection. The original overwritten data is then stored in certain sectors on disk for later file content counterfeiting. The infector also changes the entry point of the infected file to the address of the 824-byte code.
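One consequence of this scheme is that the entry point of an infected driver lands inside its resource section, which is unusual for a legitimate driver. A rough sketch of that heuristic follows; the section layout in the demo is invented, and such a check is only meaningful on a copy of the file read offline, since the active rootkit returns the original bytes to anything reading through the infected driver.

```python
from typing import List, NamedTuple, Optional


class Section(NamedTuple):
    name: str
    virtual_address: int  # RVA where the section starts
    virtual_size: int     # size of the section in memory


def section_of(rva: int, sections: List[Section]) -> Optional[Section]:
    """Return the section containing the given RVA, if any."""
    for s in sections:
        if s.virtual_address <= rva < s.virtual_address + s.virtual_size:
            return s
    return None


def entry_point_in_resources(entry_rva: int, sections: List[Section]) -> bool:
    """Heuristic: an entry point inside .rsrc is a strong red flag."""
    s = section_of(entry_rva, sections)
    return s is not None and s.name == ".rsrc"


if __name__ == "__main__":
    # Hypothetical section table, not the real atapi.sys layout.
    layout = [
        Section(".text", 0x1000, 0x10000),
        Section(".rsrc", 0x17000, 0x400),
    ]
    print(entry_point_in_resources(0x1500, layout))   # clean-looking entry
    print(entry_point_in_resources(0x17100, layout))  # entry inside .rsrc
```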

II.4.2 Rootkit’s survive-reboot strategy

The previous variants of TDL/TDSS survived reboot by creating startup services for themselves and keeping their malicious code in ordinary files. So what’s new in TDL3? The author(s) decided to go lower and deeper. The rootkit no longer uses the file system to store its files; it reads and writes directly to the disk’s sectors. The main rootkit code is stored in the last sectors of the disk, at a sector number calculated by the formula total_number_of_disk_sectors – (number_of_rootkit_sectors + number_of_overwritten_data_sectors).
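The formula is simple enough to write down directly. The sketch below assumes 512-byte sectors, and the disk size and sector counts in the demo are made-up numbers for illustration only.

```python
SECTOR_SIZE = 512  # assumption: classic 512-byte sectors


def sectors_needed(byte_count: int, sector_size: int = SECTOR_SIZE) -> int:
    """Number of whole sectors required to hold byte_count bytes."""
    return -(-byte_count // sector_size)  # ceiling division


def storage_start_sector(total_sectors: int,
                         rootkit_sectors: int,
                         overwritten_sectors: int) -> int:
    """First sector of the hidden area at the end of the disk."""
    return total_sectors - (rootkit_sectors + overwritten_sectors)


if __name__ == "__main__":
    # Hypothetical disk: 1,000,000 sectors, 40 sectors of rootkit code,
    # 7 sectors of saved original driver data.
    print(storage_start_sector(1_000_000, 40, 7))
    # The 824-byte loader would occupy 2 sectors if stored on its own.
    print(sectors_needed(824))
```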

The next time the system reboots, when the 824 bytes in the infected driver get executed, the code waits for the file system setup to finish (by registering a filesystem notification routine), then loads and runs the rootkit stored in the last sectors of the disk.

Figure 5 demonstrates how TDL3 performs the installation: the real rootkit code and the overwritten atapi.sys data are placed into a buffer at 0x817e1000. The total size of the data to be written is 0x5e00 bytes. Next, it writes this buffer to contiguous sectors starting at sector number 0x3fffc0. Notice that 4 bytes of the written buffer are the signature of the rootkit, ‘TDL3’ (without quotes). The 824-byte loader also checks for this signature when it reads these sectors back.

Figure 4. 824 bytes loader check for TDL3 signature
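Using the numbers above, a quick sketch shows which sectors the write covers and how a signature check could look. Two assumptions are mine: sectors are 512 bytes, and the signature sits at the very start of the buffer (the offset parameter is there in case it does not).

```python
SECTOR_SIZE = 512        # assumption: 512-byte sectors
SIGNATURE = b"TDL3"      # signature string named in the analysis


def sector_span(start_sector: int, byte_count: int,
                sector_size: int = SECTOR_SIZE):
    """Return (first, last) sector touched by a write of byte_count bytes."""
    count = -(-byte_count // sector_size)  # ceiling division
    return start_sector, start_sector + count - 1


def has_tdl3_signature(buf: bytes, offset: int = 0) -> bool:
    """Check for the 4-byte signature at the given offset (assumed 0)."""
    return buf[offset:offset + 4] == SIGNATURE


if __name__ == "__main__":
    first, last = sector_span(0x3FFFC0, 0x5E00)
    print(hex(first), hex(last), last - first + 1)
    print(has_tdl3_signature(b"TDL3" + bytes(508)))
```

With 512-byte sectors, 0x5e00 bytes is exactly 47 sectors, so the write covers sectors 0x3fffc0 through 0x3fffee.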

II.4.3 Rootkit’s direct read/write feature

Another interesting feature of the infector/dropper is the way it issues read/write/query requests directly to the hard disk via the infected miniport driver’s dispatch routine.

Figure 5. TDL3 uses SCSI requests to write rootkit codes to harddisk

For example, as seen in Figure 5, in order to write the rootkit code along with the original overwritten atapi.sys bytes to the last sectors of the hard disk, the kernel mode dropper calls a special routine to build an IRP whose IO_STACK_LOCATION contains a SCSI_REQUEST_BLOCK with the function SRB_FUNCTION_EXECUTE_SCSI. The SRB is filled in with the appropriate information: the write buffer, the buffer’s length, the sector to write to, the dispatch routine (IdePortDispatch) and the target device object. This method has been used before in class drivers such as classpnp.sys, and it is notably implemented in some famous antirootkit tools such as RootkitUnhooker. Figure 6 shows the pseudo-code of TDL3 setting up the SRB before sending requests to the infected miniport driver’s dispatch routines.

Figure 6. TDL3 setting up SCSI_REQUEST_BLOCK
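Part of any such SRB is the SCSI command descriptor block (CDB) that names the operation, the starting sector and the sector count. The sketch below packs a standard 10-byte READ(10)/WRITE(10) CDB; whether TDL3 uses exactly this CDB form is my assumption, and the function name is hypothetical. Only the CDB layout itself comes from the SCSI block command set.

```python
import struct

SCSIOP_READ = 0x28   # READ(10) opcode
SCSIOP_WRITE = 0x2A  # WRITE(10) opcode


def build_rw10_cdb(opcode: int, lba: int, sector_count: int) -> bytes:
    """Pack a 10-byte READ(10)/WRITE(10) command descriptor block."""
    if not 0 <= lba <= 0xFFFFFFFF:
        raise ValueError("LBA does not fit in 32 bits")
    if not 0 <= sector_count <= 0xFFFF:
        raise ValueError("transfer length does not fit in 16 bits")
    # Big-endian layout: opcode, flags, 32-bit LBA, group number,
    # 16-bit transfer length (in sectors), control byte.
    return struct.pack(">BBIBHB", opcode, 0, lba, 0, sector_count, 0)


if __name__ == "__main__":
    # Values from the analysis: 47 sectors starting at sector 0x3fffc0
    # (47 assumes 512-byte sectors for the 0x5e00-byte buffer).
    cdb = build_rw10_cdb(SCSIOP_WRITE, 0x3FFFC0, 47)
    print(cdb.hex())
```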

III. The TDL3 Rootkit

III.1 File content counterfeiting

The stand-out feature of the TDL/TDSS rootkit family is its ability to hide the rootkit’s files from scanners. Files are obviously the biggest weakness in the gang’s plan to stay under the radar, which is why the author(s) put so much effort into improving their stealth. See DiabloNova’s article on his rootkit.com blog for more information about the evolution of this rootkit family’s file-hiding techniques.

Not surprisingly, it is still a hide-and-seek game with the mysterious TDL3 rootkit. The author(s) of this rootkit no longer hide whole files from scanners. Instead, they followed Rustock.C’s trick: counterfeiting the content of the infected victim file and other protected areas.

And it does this pretty well. Currently, no fully updated AV or ARK out there can detect the rogue while it is active. Even if they could, they would only find a small piece of it (e.g. the load image notify routine, stealthy code, etc.). All attempts at reading the real content of the infected file simply return the innocent, original bytes.

How did TDL3 protect itself so effectively?

In order to protect the real content of the infected hard disk miniport driver, the rootkit hooks the miniport driver object and redirects all of its dispatch routines to the rootkit’s own handler.

Figure 7. TDL3 patching atapi.sys’s dispatcher table

The rootkit’s hook handler inspects every IRP_MJ_SCSI packet traveling through the miniport driver, but it is interested only in SCSI requests whose SRB function is set to SRB_FUNCTION_EXECUTE_SCSI and whose SRB flags include SRB_FLAGS_DATA_IN or SRB_FLAGS_DATA_OUT.

If the SRB flags include SRB_FLAGS_DATA_IN, the hook handler performs the file content counterfeiting by setting a completion routine before forwarding the original IRP. This completion routine does the dirty work on the returned buffers.

The completion routine is illustrated in Figure 8a.

Figure 8a. Pseudo code of TDL3 filtering completion routine

NOTE: The protected sectors array is where TDL3 stores the information about content-modified sectors: the sector number, the length of the data to be copied, the offset, and the address of the buffer containing the original data. Its structure is defined in Figure 8b. The protected sectors in the sample I have are the ones overwritten with the 824-byte rootkit loader, plus other atapi.sys areas.

Figure 8b. TDL3 protected sector structure

As shown above, if an application issues a SCSI request that TDL3 is interested in, the completion routine loops through the protected sectors array to check whether the range given by the requested start sector and number of sectors covers one of them. If it does, the rootkit copies the original data over the returned buffer, handing the application totally fake data.
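The logic of that loop can be modelled in a few lines. This is only a user-mode sketch under my own assumptions: 512-byte sectors, field names chosen to mirror the description of the protected sector structure, and the saved original bytes held inline rather than referenced by address.

```python
from dataclasses import dataclass
from typing import List

SECTOR_SIZE = 512  # assumption: 512-byte sectors


@dataclass
class ProtectedSector:
    sector: int      # number of the modified sector
    offset: int      # byte offset of the patch inside that sector
    length: int      # number of bytes to copy
    original: bytes  # clean bytes saved before the infection


def counterfeit_read(start_sector: int, sector_count: int,
                     buffer: bytearray,
                     protected: List[ProtectedSector]) -> bytearray:
    """Patch a just-completed read buffer in place with the original bytes."""
    for p in protected:
        # Does the requested range cover this protected sector?
        if start_sector <= p.sector < start_sector + sector_count:
            pos = (p.sector - start_sector) * SECTOR_SIZE + p.offset
            buffer[pos:pos + p.length] = p.original[:p.length]
    return buffer


if __name__ == "__main__":
    table = [ProtectedSector(sector=101, offset=16, length=4, original=b"GOOD")]
    data = bytearray(b"\xCC" * (2 * SECTOR_SIZE))  # pretend infected content
    counterfeit_read(100, 2, data, table)
    print(data[SECTOR_SIZE + 16:SECTOR_SIZE + 20])
```

The reader never sees the infected bytes: by the time the buffer is handed back, the patched regions already contain the saved originals.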

The rootkit also zeroes out the request buffer if the request is an attempt to read the last sectors of the hard disk, where the rootkit’s data (kernel code, config.ini, DLLs) is stored.

Figure 9. Pseudo code of TDL3 blocking reading last sectors of disk
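A comparable sketch for the tail-of-disk case follows. Whether the rootkit blanks the whole buffer or only the overlapping part is not something I can state from the figure, so zeroing just the overlap is an assumption here, as are the 512-byte sector size and the parameter names.

```python
def filter_tail_read(start_sector: int, sector_count: int,
                     buffer: bytearray, hidden_start: int,
                     sector_size: int = 512) -> bytearray:
    """Zero the part of a read buffer that overlaps the hidden tail area."""
    end_sector = start_sector + sector_count
    first_hidden = max(start_sector, hidden_start)
    if first_hidden < end_sector:
        lo = (first_hidden - start_sector) * sector_size
        hi = sector_count * sector_size
        buffer[lo:hi] = bytes(hi - lo)  # same-length slice of zero bytes
    return buffer


if __name__ == "__main__":
    # Read of 2 sectors that straddles the start of the hidden area.
    data = bytearray(b"\xAA" * 1024)
    filter_tail_read(0x3FFFBF, 2, data, hidden_start=0x3FFFC0)
    print(data[:4].hex(), data[512:516].hex())
```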

TDL3 also adjusts the modified parts of the infected image in kernel memory, so that any memory forensics attempt will fail to detect suspicious mismatches between the image on disk and the loaded one.

Because the hook takes place in a very low-level miniport driver, all AVs and ARKs are fooled into relying on the forged data returned by the rogue. I believe none of them can detect it without changing their read/write mechanism.

III.2 Anti-Hook detection

Of course, rootkits hook. That isn’t new. So before throwing this nasty creature into a debugger, I tested it with the most up-to-date versions of some antirootkits out there to find its hooks: my private version of CodeWalker, a_d_13’s RootRepeal, UG North’s RkU and GMER. None of them gave the correct result for TDL3’s dispatch table patches.

Why? After a few debugging sessions, it turned out that just one small trick defeats all of the above tools. The rootkit simply creates an 11-byte stub inside the infected driver’s image space. As you can see in Figure 11, this 11-byte stub transfers the execution flow to the real rootkit IRP hook handler, which resides in kernel pool memory at 0x817e4e31. The detection algorithm of all the above antirootkit tools basically relies only on checking whether the dispatch routines’ addresses fall within the range of the driver image, without analyzing the actual final destination of the handlers, so they all fall into the rootkit’s trap.

Figure 10. atapi.sys’s dispatcher table before TDL’s hooks

Figure 11. atapi.sys’s dispatcher table after hooking.
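The difference between the two detection strategies is easy to show in a toy model. In the sketch below the image range and stub address are invented, and the stub is represented as a simple address-to-destination mapping, whereas a real tool would have to disassemble the stub to find where it jumps.

```python
from typing import Dict, List

# Hypothetical load range of the infected driver image.
IMAGE_BASE = 0xF8400000
IMAGE_SIZE = 0x18000


def in_image(addr: int) -> bool:
    return IMAGE_BASE <= addr < IMAGE_BASE + IMAGE_SIZE


def naive_check(dispatch_table: List[int]) -> bool:
    """Range-only check: 'clean' if every handler lies inside the image."""
    return all(in_image(a) for a in dispatch_table)


def resolve(addr: int, stubs: Dict[int, int]) -> int:
    """Follow jump stubs until reaching an address that is not a stub."""
    seen = set()
    while addr in stubs and addr not in seen:
        seen.add(addr)
        addr = stubs[addr]
    return addr


def deep_check(dispatch_table: List[int], stubs: Dict[int, int]) -> bool:
    """'Clean' only if the final destinations lie inside the image."""
    return all(in_image(resolve(a, stubs)) for a in dispatch_table)


if __name__ == "__main__":
    stub_addr = IMAGE_BASE + 0x17F00       # made-up stub location in the image
    stubs = {stub_addr: 0x817E4E31}        # pool address from the analysis
    table = [stub_addr] * 3                # every handler points at the stub
    print(naive_check(table), deep_check(table, stubs))
```

The range-only check reports the table as clean, while following the stub to its destination exposes the handler living outside the driver image.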

III.3 User-mode injection

Although a lot of effort has been put into it, the rootkit itself is just an “injector” (as the author(s) call it themselves), and injecting the user-mode bot components into processes is its main task.

For that ultimate purpose, the rootkit registers a load image notify routine so that every time a thread loads “kernel32.dll”, the notify routine schedules an APC starting at LoadLibraryExA to force that thread to load the dropped trojan DLLs (tdlcmd.dll and tdlswp.dll) inside the user-mode thread’s process. This is the only suspicious behaviour that current ARKs are able to detect.

Figure 12. TDL3 DLL injection by scheduling APC execution

III.4 TDL3 Encrypted File System

As soon as the TDL3 kernel mode rootkit is active, the dropper drops 3 files onto its own storage: tdlcmd.dll, tdlswp.dll and config.ini. In detail, TDL3 organizes its own special storage rather than using a traditional filesystem:

- It implements a kind of RC4 Encrypted File System (EFS), reserved within a dynamic number of the hard disk’s last sectors calculated at installation time.
- It creates a simple “partition table” stored in the last sectors of the hard disk, tagged as ‘TDLD’ (which could stand for “TDL Data”), as shown in Figure 13. Inside this table, TDL stores the filenames and their information.

Figure 13. TDL3 “partition table”

- All files are encrypted and stored in the last sectors of the hard disk as well, right before TDL’s “partition table”. Each is tagged as “TDLF”, which I believe is an abbreviation of “TDL Files”. Unusually, they are not written contiguously but backwards, in steps of 2 sectors. Since TDL3’s storage follows the EFS model, the content of the sectors is RC4 encrypted and decrypted on the fly per request, transparently to readers. Figures 14 and 15 demonstrate a TDL3 write request to its EFS. The screenshots were taken while TDL3’s dropper was dropping tdlcmd.dll to disk via the ordinary API WriteFile().

Figure 14. tdlcmd.dll’s non-encrypted content before being written

Figure 15. After being encrypted with RC4, data is written to disk

- In order to access the files inside its own EFS, TDL3 constructs a random path such as \Device\Ide\IdePort1\enticxfj to redirect requests into its own filesystem stack. Therefore TDL3’s encrypted files are still valid and accessible via ordinary system APIs such as CreateFile(), WriteFile(), etc.

When the rootkit is reloaded at the next reboot, it creates another random path similar to the one above, then begins the user-mode DLL injection with that random path, as in Figure 12.
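For reference, RC4 itself is small enough to fit in a few lines, and because it is a stream cipher the same routine both encrypts and decrypts. The sketch below is the textbook algorithm; the key in the demo is a placeholder, since the key TDL3 actually uses is not covered here.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4: key scheduling followed by keystream XOR."""
    # Key-scheduling algorithm (KSA).
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) & 0xFF
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation algorithm (PRGA), XORed with the data.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) & 0xFF
        j = (j + s[i]) & 0xFF
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) & 0xFF])
    return bytes(out)


if __name__ == "__main__":
    key = b"placeholder-key"          # hypothetical key, not TDL3's
    sector = b"MZ" + bytes(510)       # pretend sector of tdlcmd.dll
    encrypted = rc4(key, sector)
    print(rc4(key, encrypted) == sector)
```

This symmetry is what makes transparent on-the-fly handling cheap: a read path and a write path can share one routine.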

III.5 TDL3 fun stuff

While trying to harm victims, the author(s) also reveal their good taste in films. In the first sample I have, the rootkit picks one of 4 random quotes from “Fight Club”, a famous 1999 action flick starring Brad Pitt, and “The Simpsons Movie”, a funny 2007 cartoon, to be displayed as a debug string when the filesystem setup finishes:

The things you own end up owning you

You are not your fucking khakis

This is your life, and it’s ending one minute at a time

Spider-Pig, Spider-Pig, does whatever a Spider-Pig does. Can he swing, from a web? No he can’t, he’s a pig. Look out! He is a Spider-Pig!

In the second sample, retrieved on 11/03/2009, these random strings are suddenly changed to other Homer Simpson quotes and a special message to malware analysts:

Jebus where are you? Homer calls Jebus!
Dude, meet me in Montana XX00, Jesus (H. Christ)
Spider-Pig, Spider-Pig, does whatever a Spider-Pig does. Can he swing, from a web? No he can’t, he’s a pig. Look out! He is a Spider-Pig!
I’m normally not a praying man, but if you’re up there, please save me Superman.
Alright Brain, you don’t like me, and I don’t like you. But lets just do this, and I can get back to killing you with beer

TDL3 is not a new TDSS!

The author(s) are trying to tell us that TDL3 isn’t a new TDSS. Well, honestly, I don’t care; TDL3 or TDSS, it doesn’t matter. The important thing is that we apparently share a taste in films, at least.

IV. TDL3 detection

Although it is armed with the special techniques described above, this rootkit leaves some traces inside the system which it cannot clean up, due to its mechanism and lack of protection. For that reason, it’s trivial to detect its existence without executing anything in kernel mode. Currently I’m developing a tool to detect TDL3 from user mode, yet it’s unstable, so the tool will be released when I find the time is right (: Of course, I guess as soon as it goes out, the author(s) will immediately counteract by modifying the current sources for the next TDL versions (TDL4, TDL5, etc.), but that’s the game, isn’t it?

Anyway, technically, what if you want to bypass its protection from kernel mode? The rootkit hooks the miniport’s dispatch table. Therefore one needs to obtain the miniport dispatch routine manually and send SCSI requests to it without relying on the class driver, in order to avoid sector content tampering. Or you can implement your own IDE/SCSI miniport driver. The advantage is that this is the ultimate solution, and it helps in dealing with future TDL versions or other types of rootkits, which will definitely hook deeper and deeper, lower and lower. However, both suggested methods cost developers a lot of effort and time and, more importantly, they aren’t hardware independent.

V. Conclusion

TDL3 is the most advanced and stealthiest TDL rootkit I have analyzed so far. It operates at the very low levels of the Windows storage system and relies heavily on many undocumented concepts, such as miniport driver dispatch routines and other kernel mode objects. This version is proof of the professional approach the gang has practised throughout its technical evolution. It’s also clear that the gang is watching and reversing 3rd-party ARK tools in order to employ deeper and more sophisticated techniques against malcode scanners. “Low, low and lower” should be enough to describe their motto and today’s rootkit scene.

VI. Greets, thanks

Thanks go to:

a_d_13: for his generosity in providing me with the TDL3 dropper sample, for his review of this analysis and for the friendly discussion we had.
Frank Boldewin: for his review and his information about a cool rootkit (:
DiabloNova: for his early notification about this TDL version 3 and his review.
TDL3’s author(s): without your work, this analysis would never have existed (:

VII. APPENDIX

A first look at the TDL3 rootkit code suggests it could have been generated automatically from another compiled binary. It uses a simple obfuscated string builder, such as:

Appendix 1. Obfuscated string builder

Almost every malware reverser uses strings as a starting point for static analysis. In this case, the difficulty of listing all the strings that appear inside the rootkit makes our usual habit useless.

Moreover, the rootkit can stymie static analysis by calling ntoskrnl.exe’s APIs on the fly instead of importing them as typical binaries do. As a rule, it resolves those routines’ addresses via custom hashes of the APIs’ names, then passes the required arguments whenever it has to call one of them.

Appendix 2. Calling ntoskrnl’s exports on-the-fly
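The general shape of hash-based export resolution, and of the lookup table an analyst builds to undo it, can be sketched as follows. The hash function here is the well-known rotate-right-by-13 additive hash, chosen purely as a stand-in: TDL3 uses its own custom hash, which is not reproduced here.

```python
from typing import Dict, Iterable


def ror32(value: int, count: int) -> int:
    """Rotate a 32-bit value right by count bits."""
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def name_hash(name: str) -> int:
    """Stand-in hash (ROR-13 additive); NOT the rootkit's real algorithm."""
    h = 0
    for ch in name.encode("ascii"):
        h = (ror32(h, 13) + ch) & 0xFFFFFFFF
    return h


def build_lookup(export_names: Iterable[str]) -> Dict[int, str]:
    """Precompute hash -> export name for every export of a module."""
    return {name_hash(n): n for n in export_names}


if __name__ == "__main__":
    # A few real ntoskrnl.exe exports, used only as sample input.
    exports = ["ExAllocatePool", "IoCreateDevice", "ZwOpenFile"]
    lookup = build_lookup(exports)
    wanted = name_hash("IoCreateDevice")   # constant as it would appear in code
    print(hex(wanted), lookup.get(wanted))
```

Once the real hash routine has been recovered from the binary, swapping it in for name_hash and feeding the table the full export list turns every hash constant in the disassembly back into a readable name.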

As a matter of fact, the rootkit binaries are very hard to follow in IDA. I wrote two small Python helper scripts to identify the embedded strings and resolve the routines’ names from their hashes, to make the code easier to understand and to remove the obstacles mentioned above. You can use them with IDAPython. The scripts can be downloaded from here.

Running the two scripts yields very useful information for starting static analysis.

Appendix 3. Deobfuscated strings inside TDL3

Appendix 4. Resolved ntoskrnl’s exports used by TDL3
