RSS Atom Add a new post titled:

Debian LTS Logo

Here is my transparent report for my work on the Debian Long Term Support (LTS) project, which extends the security support for past Debian releases, as a paid contributor.

In April, the monthly sponsored hours were split evenly among contributors depending on their max availability - I declared max 30h and got 17.25h.

Most of my time was spent on frontdesk duties, in particular vulnerabilities (CVE) triaging, so other contributors quickly know what to work on.
In all honesty I spent more time than assigned, as I took upon myself to dig how things work. Fun facts:

  • The (stable, non-LTS) Debian Security Team has a dozen members but the vast majority of the work is done by 2 people - every single day.
  • The main workflow is: import a daily list of new (public) CVEs from MITRE, batch classify for-us/not-for-us, locate information (patches...), determine severity, and possibly fix. I'm not sure how we're notified of private (embargoed) issues, they are rare.
  • The CVE list grew to a 19MB text file, which Git is pathologically bad at handling. Be ready to git gc regularly and forget about git blame (which is annoying when tracking the evolution of a particular vulnerability).
  • We discussed how to justify whether to fix a vulnerability, with topics on funding and light justifications ("minor issue").
  • Dealing with MITRE is still difficult, I couldn't get CVE-2018-19211 properly marked as duplicate and we had to de-dup on our side; however they did right on not rejecting CVE-2018-19217 as I asked since we eventually tracked a totally different affected version.

Anyway, for a more formal report:

  • triage of new and past undetermined vulnerabilities for jessie: samba (dla-needed), evolution-ews (dla-needed + open bug), libpodofo (ignored), claws-mail (dla-needed + open bug), kgb-bot (refresh status), systemd (dla-needed), cacti (dla-needed), wireshark (5 dla-needed, 5 not-affected jessie, 3 not-affected stretch), android-platform-system-core (NFU/not for us), exiv2 (not-affected), spip (not-affected), twitter-bootstrap (no-dsa, minor), ncurses (undetermined to duplicate + already fixed, clarify with upstream and MITRE), xslt (still no info from Apple), wpa (2 ignored + dla-needed), webkit2gtk (unsupported), epiphany-browser (not affected), gradle (dla-needed + open bug), qt4-x11 (dla-needed), libxslt (dla-needed), axis (dla-needed + report wrong link), gpac (dla-needed)
  • ghostscript: jessie-security update, backporting stretch-security's
  • answer user request on debian-lts
  • workflow discussions: double-posting annoucements, justifying (non-)updates
  • doc updates (reference logos page, update mailing-lists URLs, clamav handling, triage process, www update rationale

If you'd like to know more about LTS security, I recommend you check:

Posted Mon Apr 29 11:17:16 2019 Tags:

Debian LTS Logo

In February I had requested to join the Debian LTS project, which extends the security support for past Debian releases, as a paid contributor.
Kuddos to Freexian for pulling this project out.

I was asked to demonstrate a full security update on my own (non paid) which I did with 2 DLAs (Debian LTS Advisory):

  • freedink-dfarc: jessie-security update, applying my own path traversal security fix
  • phmyadmin: jessie-security update, assessing 1 CVE as not affected and fixing another

Incidentally, every Debian Developer can make a direct security upload to jessie-security without prior validation (just follow the guide).

-

Following the spirit of transparency that animates Debian and Debian Security, here's my report for my first paid month.

In March, the monthly sponsored hours were split evenly among contributors depending on their max availability.
I got 29.5h, which I spent on:

  • nettle/gnutls: investigate local side-channel attack and conclude no-dsa / minor issue
  • symfony: helped test Roberto's update
  • sqlalchemy: jessie-security update for SQL injection, tested and discussed upstream's own backported patch
  • glib2.0: investigate denial of service and mark as no-dsa / no reproducible
  • ghostscript: investigate sandbox break and (lack of) test suite, and conclude we'll backport the next upstream release
  • pdns: jessie-security update for the 'remote' backend
  • Fixes/updates in dla-needed.txt, our (public) list of triaged security issues
  • Fixes in LTS wiki, templates and scripts, in particular wrt https://www.debian.org/lts/security/ integration

If you'd like to know more about LTS security, I recommend you check:

Posted Wed Apr 3 13:55:56 2019 Tags:

Android Rebuilds, which provides freely-licensed Android development tools, starts a public repository with F-Droid :)

https://mirror.f-droid.org/android-free/repository/

We're now trying to make it usable for sdkmanager, which should vastly ease the installation process (download what you need, rather than huge mono-version bundles).
Help is welcome to format the repository and possibly adding a way to override sdkmanager's default repositories, join the conversation!

https://forum.f-droid.org/t/call-for-help-making-free-software-builds-of-the-android-sdk/4685

Posted Sun Feb 24 19:28:40 2019 Tags:

I like the Ren'Py project, a popular game engine aimed at Visual Novels - that can also be used as a portable Python environment.

One limitation was that it required downloading games, while nowadays people are used to Flash- or HTML5- based games that play in-browser without having to (de)install.

Can this fixed? While maintaining compatibility with Ren'Py's several DSLs? And without rewriting everything in JavaScript?
Can Emscripten help? While this is a Python/Cython project?
After lots of experimenting, and full-stack patching/contributing, it turns out the answer is yes!

Live demo:
https://renpy.beuc.net/
The Question Tutorial Your game

At last I finished organizing and cleaning-up, published under a permissive free software / open source license, like Python and Ren'Py themselves.
Python port:
https://www.beuc.net/python-emscripten/python/dir?ci=tip
Build system:
https://github.com/renpy/renpyweb

Development in going on, consider supporting the project!
Patreonhttps://www.patreon.com/Beuc

Posted Tue Feb 19 18:38:01 2019 Tags:

As described in a previous post, Google is still click-wrapping all Android developer binaries with a non-free EULA.

I recompiled SDK 9.0.0, NDK r18b and SDK Tools 26.1.1 from the free sources to get rid of it:

https://android-rebuilds.beuc.net/

with one-command, Docker-based builds:

https://gitlab.com/android-rebuilds/auto

This triggered an interesting thread about the current state of free dev tools to target the Android platform.

Hans-Christoph Steiner also called for joining efforts towards a repository hosted using the F-Droid architecture:

https://forum.f-droid.org/t/call-for-help-making-free-software-builds-of-the-android-sdk/4685

What do you think?

Posted Sun Dec 2 14:57:30 2018 Tags:

Why try to choose the host that sucks less, when hosting a single-file (S)CGI gets you decentralized git-like + tracker + wiki?

Fossil

https://www.fossil-scm.org/

We gotta take the power back.

Posted Wed Jun 6 17:12:45 2018 Tags:

I'm working again on making reproducible .exe-s. I thought I'd share my process:

Pros:

  • End users get a bit-for-bit reproducible .exe, known not to contain trojan and auditable from sources
  • Point releases can reuse the exact same build process and avoid introducing bugs

Steps:

  • Generate a source tarball (non reproducibly)
  • Debian Docker as a base, with fixed version + snapshot.debian.org sources.list
    • Dockerfile: install packaged dependencies and MXE(.cc) from a fixed Git revision
    • Dockerfile: compile MXE with SOURCE_DATE_EPOCH + fix-ups
  • Build my project in the container with SOURCE_DATE_EPOCH and check SHA256
  • Copy-on-release

Result:

git.savannah.gnu.org/gitweb/?p=freedink/dfarc.git;a=tree;f=autobuild/dfarc-w32-snapshot

Generate a source tarball (non reproducibly)

This is not reproducible due to using non-reproducible tools (gettext, automake tarballs, etc.) but it doesn't matter: only building from source needs to be reproducible, and the source is the tarball.

It would be better if the source tarball were perfectly reproducible, especially for large generated content (./configure, wxGlade-generated GUI source code...), but that can be a second step.

Debian Docker as a base

AFAIU the Debian Docker images are made by Debian developers but are in no way official images. That's a pity, and to be 100% safe I should start anew from debootstrap, but Docker is providing a very efficient framework to build images, notably with caching of every build steps, immediate fresh containers, and public images repository.

This means with a single:

sudo -g docker make

you get my project reproducibly built from scratch with nothing to setup at all.

I avoid using a :latest tag, since it will change, and also backports, since they can be updated anytime. Here I'm using stretch:9.4 and no backports.

Using snapshot.debian.org in sources.list makes sure the installed packaged dependencies won't change at next build. For a dot release however (not for a rebuild), they should be updated in case there was a security fix that has an effect on built software (rare, but exists).

Last but not least, APT::Install-Recommends "false"; for better dependency control.

MXE

mxe.cc is compilation environment to get MingGW (GCC for Windows) and selected dependencies rebuilt unattended with a single make. Doing this manually would be tedious because every other day, upstream breaks MinGW cross-compilation, and debugging an hour-long build process takes ages. Been there, done that.

MXE has a reproducible-boosted binutils with a patch for SOURCE_DATE_EPOCH that avoids getting date-based and/or random build timestamps in the PE (.exe/.dll) files. It's also compiled with --enable-deterministic-archives to avoid timestamp issues in .a files (but no automatic ordering).

I set SOURCE_DATE_EPOCH to the fixed Git commit date and I run MXE's build.

This does not apply to GCC however, so I needed to e.g. patch a __DATE__ in wxWidgets.

In addition, libstdc++.a has a file ordering issue (said ordering surprisingly stays stable between a container and a host build, but varies when using a different computer with the same distros and tools versions). I hence re-archive libstdc++.a manually.

It's worth noting that PE files don't have issues with build paths (and varying BuildID-s - unlike ELF... T_T).

Again, for a dot release, it makes sense to update the MXE Git revision so as to catch security fixes, but at least I have the choice.

Build project

With this I can start a fresh Docker container and run the compilation process inside, as a non-privileged user just in case.

I set SOURCE_DATE_EPOCH to the release date at 00:00UTC, or the Git revision date for snapshots.

This rebuild framework is excluded from the source tarball, so the latter stays stable during build tuning. I see it as a post-release tool, hence not part of the release (just like distros packaging).

The generated .exe is statically compiled which helps getting a stable result (only the few needed parts of dependencies get included in the final executable).

Since MXE is not itself reproducible differences may come from MXE itself, which may need fixes as explained above. This is annoying and hopefully will be easier once they ship GCC6. To debug I unzip the different .zip-s, upx -d my .exe-s, and run diffoscope.

I use various tricks (stable ordering, stable timestamping, metadata cleaning) to make the final .zip reproducible as well. Post-processing tools would be an alternative if they were fixed.

reprotest

Any process is moot if it can't be tested.

reprotest helps by running 2 successive compilations with varying factors (build path, file system ordering, etc.), and check that we get the exact same binary. As a trade-off, I don't run it on the full build environment, just on the project itself. I plugged reprotest to the Docker container by running a sshd on the fly. I have another Makefile target to run reprotest in my host system where I also installed MXE, so I can compare results and sometimes find differences (e.g. due to using a different filesystem). In addition this is faster for debugging since changing anything in the early Dockerfile steps means a full 1h rebuild.

Copy-on-release

At release time I make a copy of the directory that contains all the self-contained build scripts and the Dockerfile, and rename it after the new release version. I'll continue improving upon the reproducible build system in the 'snapshot' directory, but the versioned directory will stay as-is and can be used in the future to get the same bit-for-bit identical .exe anytime.

This is the technique I used in my Android Rebuilds project.

Other platforms

For now I don't control the build process for other platforms: distros have their own autobuilders, so does F-Droid. Their problem :P

I have plans to make reproducible GNU/Linux AppImage-based builds in the future though. I should be able to use a finer-grained, per-dependency process rather than the huge MXE-based chunk I currently do.

I hope this helps other projects provide reproducible binaries directly! Comments/suggestions welcome.

Posted Sat Jun 2 17:12:59 2018 Tags:

Ever wanted to try this weird GNU FreeDink game, but never had the patience to install it?
Today, you can play it with a single click :)

Play GNU FreeDink

This is a first version that can be polished further but it works quite well.
This is the original C/C++/SDL2 code with a few tweaks, cross-compiled to WebAssembly (and an alternate version in asm.js) with emscripten.
Nothing brand new I know, but things are getting smoother, and WebAssembly is definitely a performance boost.

I like distributed and autonomous tools, so I'm generally not inclined to web-based solutions.
In this case however, this is a local version of the game. There's no server side. Savegames are in your browser local storage. Even importing D-Mods (game add-ons) is performed purely locally in the in-memory virtual FS with a custom .tar.bz2 extractor cross-compiled to WebAssembly.
And you don't have to worry about all these Store policies (and Distros policies^W^W^W.

I'm interested in feedback on how well these works for you in your browsers and devices:

I'm also interested in tips on how to place LibreJS tags - this is all free JavaScript.

Posted Fri May 25 23:37:39 2018 Tags:

Following last week's .zed format reverse-engineered specification, Loïc Dachary contributed a POC extractor!
It's available at http://www.dachary.org/loic/zed/, it can list non-encrypted metadata without password, and extract files with password (or .pem file).
Leveraging on python-olefile and pycrypto, only 500 lines of code (test cases excluded) are enough to implement it :)

Posted Tue Sep 19 19:28:14 2017 Tags:

TL,DR: I reverse-engineered the .zed encrypted archive format.
Following a clean-room design, I'm providing a description that can be implemented by a third-party.
Interested? :)

(reference version at: https://www.beuc.net/zed/)

.zed archive file format

Introduction

Archives with the .zed extension are conceptually similar to an encrypted .zip file.

In addition to a specific format, .zed files support multiple users: files are encrypted using the archive master key, which itself is encrypted for each user and/or authentication method (password, RSA key through certificate or PKCS#11 token). Metadata such as filenames is partially encrypted.

.zed archives are used as stand-alone or attached to e-mails with the help of a MS Outlook plugin. A variant, which is not covered here, can encrypt/decrypt MS Windows folders on the fly like ecryptfs.

In the spirit of academic and independent research this document provides a description of the file format and encryption algorithms for this encrypted file archive.

See the conventions section for conventions and acronyms used in this document.

Structure overview

The .zed file format is composed of several layers.

  • The main container is using the (MS-CFB), which is notably used by MS Office 97-2003 .doc files. It contains several streams:

    • Metadata stream: in OLE Property Set format (MS-OLEPS), contains 2 blobs in a specific Type-Length-Value (TLV) format:

      • _ctlfile: global archive properties and access list
        It is obfuscated by means of static-key AES encryption.
        The properties include archive initial filename and a global IV.
        A global encryption key is itself encrypted in each user entry.

      • _catalog: file list
        Contains each file metadata indexed with a 15-bytes identifier.
        Directories are supported.
        Full filename is encrypted using AES.
        File extension is (redundantly) stored in clear, and so are file metadata such as modification time.

    • Each file in the archive compressed with zlib and encrypted with the standard AES algorithm, in a separate stream.
      Several encryption schemes and key sizes are supported.
      The file stream is split in chunks of 512 bytes, individually encrypted.

    • Optional streams, contain additional metadata as well as pictures to display in the application background ("watermarks"). They are not discussed here.

Or as a diagram:

+----------------------------------------------------------------------------------------------------+
| .zed archive (MS-CBF)                                                                              |
|                                                                                                    |
|  stream #1                         stream #2                       stream #3...                    |
| +------------------------------+  +---------------------------+  +---------------------------+     |
| | metadata (MS-OLEPS)          |  | encryption (AES)          |  | encryption (AES)          |     |
| |                              |  | 512-bytes chunks          |  | 512-bytes chunks          |     |
| | +--------------------------+ |  |                           |  |                           |     |
| | | obfuscation (static key) | |  | +-----------------------+ |  | +-----------------------+ |     |
| | | +----------------------+ | |  |-| compression (zlib)    |-|  |-| compression (zlib)    |-|     |
| | | |_ctlfile (TLV)        | | |  | |                       | |  | |                       | | ... |
| | | +----------------------+ | |  | | +---------------+     | |  | | +---------------+     | |     | 
| | +--------------------------+ |  | | | file contents |     | |  | | | file contents |     | |     |
| |                              |  | | |               |     | |  | | |               |     | |     |
| | +--------------------------+ |  |-| +---------------+     |-|  |-| +---------------+     |-|     |
| | | _catalog (TLV)           | |  | |                       | |  | |                       | |     |
| | +--------------------------+ |  | +-----------------------+ |  | +-----------------------+ |     |
| +------------------------------+  +---------------------------+  +---------------------------+     |
+----------------------------------------------------------------------------------------------------+

Encryption schemes

Several AES key sizes are supported, such as 128 and 256 bits.

The Cipher Block Chaining (CBC) block cipher mode of operation is used to decrypt multiple AES 16-byte blocks, which means an initialisation vector (IV) is stored in clear along with the ciphertext.

All filenames and file contents are encrypted using the same encryption mode, key and IV (e.g. if you remove and re-add a file in the archive, the resulting stream will be identical).

No cleartext padding is used during encryption; instead, several end-of-stream handlers are available, so the ciphertext has exactly the size of the cleartext (e.g. the size of the compressed file).

The following variants were identified in the 'encryption_mode' field.

STREAM

This is the end-of-stream handler for:

  • obfuscated metadata encrypted with static AES key
  • filenames and files in archives with 'encryption_mode' set to "AES-CBC-STREAM"
  • any AES ciphertext of size < 16 bytes, regardless of encryption mode

This end-of-stream handler is apparently specific to the .zed format, and applied when the cleartext's does not end on a 16-byte boundary ; in this case special processing is performed on the last partial 16-byte block.

The encryption and decryption phases are identical: let's assume the last partial block of cleartext (for encryption) or ciphertext (for decryption) was appended after all the complete 16-byte blocks of ciphertext:

  • the second-to-last block of the ciphertext is encrypted in AES-ECB mode (i.e. block cipher encryption only, without XORing with the IV)

  • then XOR-ed with the last partial block (hence truncated to the length of the partial block)

In either case, if the full ciphertext is less then one AES block (< 16 bytes), then the IV is used instead of the second-to-last block.

CTS

CTS or CipherText Stealing is the end-of-stream handler for:

  • filenames and files in archives with 'encryption_mode' set to "AES-CBC-CTS".
    • exception: if the size of the ciphertext is < 16 bytes, then "STREAM" is used instead.

It matches the CBC-CS3 variant as described in Recommendation for Block Cipher Modes of Operation: Three Variants of Ciphertext Stealing for CBC Mode.

Empty cleartext

Since empty filenames or metadata are invalid, and since all files are compressed (resulting in a minimum 8-byte zlib cleartext), no empty cleartext was encrypted in the archive.

metadata stream

It is named 05356861616161716149656b7a6565636e576a33317a7868304e63 (hexadecimal), i.e. the character with code 5 followed by '5haaaaqaIekzeecnWj31zxh0Nc' (ASCII).

The format used is OLE Property Set (MS-OLEPS).

It introduces 2 property names "_ctlfile" (index 3) and "_catalog" (index 4), and 2 instances of said properties each containing an application-specific VT_BLOB (type 0x0041).

_ctlfile: obfuscated global properties and access list

This subpart is stored under index 3 ("_ctlfile") of the MS-OLEPS metadata.

It consists of:

  • static delimiter 0765921A2A0774534752073361719300 (hexadecimal) followed by 0100 (hexadecimal) (18 bytes total)
  • 16-byte IV
  • ciphertext
  • 1 uint32be representing the length of all the above
  • static delimiter 0765921A2A0774534752073361719300 (hexadecimal) followed by "ZoneCentral (R)" (ASCII) and a NUL byte (32 bytes total)

The ciphertext is encrypted with AES-CBC "STREAM" mode using 128-bit static key 37F13CF81C780AF26B6A52654F794AEF (hexadecimal) and the prepended IV so as to obfuscate the access list. The ciphertext is continuous and not split in chunks (unlike files), even when it is larger than 512 bytes.

The decrypted text contain properties in a TLV format as described in _ctlfile TLV:

  • global archive properties as a 'fileprops' structure,

  • extra archive properties as a 'archive_extraprops' structure

  • users access list as a series of 'passworduser' and 'rsauser entries.

Archives may include "mandatory" users that cannot be removed. They are typically used to add an enterprise wide recovery RSA key to all archives. Extreme care must be taken to protect these key, as it can decrypt all past archives generated from within that company.

_catalog: file list

This subpart is stored under index 4 ("_catalog") of the MS-OLEPS metadata.

It contains a series of 'fileprops' TLV structures, one for each file or directory.

The file hierarchy can be reconstructed by checking the 'parent_id' field of each file entry. If 'parent_id' is 0 then the file is located at the top-level of the hierarchy, otherwise it's located under the directory with the matching 'file_id'.

TLV format

This format is a series of fields :

  • 4 bytes for Type (specified as a 4-bytes hexadecimal below)
  • 4 bytes for value Length (uint32be)
  • Value

Value semantics depend on its Type. It may contain an uint32be integer, a UTF-16LE string, a character sequence, or an inner TLV structure.

Unless otherwise noted, TLV structures appear once.

Some fields are optional and may not be present at all (e.g. 'archive_createdwith').

Some fields are unique within a structure (e.g. 'files_iv'), other may be repeated within a structure to form a list (e.g. 'fileprops' and 'passworduser').

The following top-level types that have been identified, and detailed in the next sections:

  • 80110600: fileprops, used for the file list as well as for the global archive properties
  • 001b0600: archive_extraprops
  • 80140600: accesslist

Some additional unidentified types may be present.

_ctlfile TLV

  • 80110600: fileprops (TLV structure): global archive properties
    • 00230400: archive_pathname (UTF-16LE string): initial archive filename (past versions also leaked the full pathname of the initial archive)
    • 80270200: encryption_mode (utf32be): 103 for "AES-CBC-STREAM", 104 for "AES-CBC-CTS"
    • 80260200: encryption_strength (utf32be): AES key size, in bytes (e.g. 32 means AES with a 256-bit key)
    • 80280500: files_iv (sequence of bytes): global IV for all filenames and file contents
  • 001b0600: archive_extraprops (TLV structure): additionnal archive properties (optional)
    • 00c40500: archive_creationtime (FILETIME): date and time when archive was initially created (optional)
    • 00c00400: archive_createdwith (UTF-16LE string): uuid-like structure describing the application that initialized the archive (optional)
      {00000188-1000-3CA8-8868-36F59DEFD14D} is Zed! Free 1.0.188.
  • 80140600: accesslist (TLV structure): describe the users, their key encryption and their permissions
    • 80610600: passworduser (TLV structure): user identified by password (0 or more)
    • 80620600: rsauser (TLV structure): user identified by RSA key (via file or PKCS#11 token) (0 or more)
    • Fields common to passworduser and rsauser:
      • 80710400: login (UTF-16LE string): user name
      • 80720300: login_md5 (sequence of bytes): used by the application to search for a user name
      • 807e0100: priv1 (uchar): user privileges; present and set to 1 when user is admin (optional)
      • 00830200: priv2 (uint32be): user privileges; present and set to 2 when user is admin, present and set to 5 when user is a marked as mandatory, e.g. for recovery keys (optional)
      • 80740500: files_key_ciphertext (sequence of bytes): the archive encryption key, itself encrypted
      • 00840500: user_creationtime (FILETIME): date and time when the user was added to the archive
    • passworduser-specific fields:
      • 80760500: pbe_salt (sequence of bytes): salt for PBE
      • 80770200: pbe_iter (uint32be): number of iterations for PBE
      • 80780200: pkcs12_hashfunc (uint32be): hash function used for PBE and PBA key derivation
      • 80790500: pba_checksum (sequence of bytes): password derived with PBA to check for password validity
      • 807a0500: pba_salt (sequence of bytes): salt for PBA
      • 807b0200: pba_iter (uint32be): number of iterations for PBA
    • rsauser-specific fields:
      • 807d0500: certificate (sequence of bytes): user X509 certificate in DER format

_catalog TLV

  • 80110600: fileprops (TLV structure): describe the archive files (0 or more)
    • 80300500: file_id (sequence of bytes): a 16-byte unique identifier
    • 80310400: filename_halfanon (UTF-16LE string): half-anonymized filename, e.g. File1.txt (leaking filename extension)
    • 00380500: filename_ciphertext (sequence of bytes): encrypted filename; may have a trailing NUL byte once decrypted
    • 80330500: file_size (uint64le): decompressed file size in bytes
    • 80340500: file_creationtime (FILETIME): file creation date and time
    • 80350500: file_lastwritetime (FILETIME): file last modification date and time
    • 80360500: file_lastaccesstime (FILETIME): file last access date and time
    • 00370500: parent_directory_id (sequence of bytes): file_id of the parent directory, 0 is top-level
    • 80320100: is_dir (uint32be): 1 if entry is directory (optional)

Decrypting the archive AES key

rsauser

The user accessing the archive will be authenticated by comparing his/her X509 certificate with the one stored in the 'certificate' field using DER format.

The 'files_key_ciphertext' field is then decrypted using the PKCS#1 v1.5 encryption mechanism, with the private key that matches the user certificate.

passworduser

An intermediary user key, a user IV and an integrity checksum will be derived from the user password, using the deprecated PKCS#12 method as described at rfc7292 appendix B.

Note: this is not PKCS#5 (nor PBKDF1/PBKDF2), this is an incompatible method from PKCS#12 that notably does not use HMAC.

The 'pkcs12_hashfunc' field defines the underlying hash function. The following values have been identified:

  • 21: SHA-1
  • 22: SHA-256

PBA - Password-based authentication

The user accessing the archive will be authenticated by deriving an 8-byte sequence from his/her password.

The parameters for the derivation function are:

  • ID: 3
  • 'pba_salt': the salt, typically an 8-byte random sequence
  • 'pba_iter': the iteration count, typically 200000

The derivation is checked against 'pba_checksum'.

PBE - Password-based encryption

Once the user is identified, 2 new values are derived from the password with different parameters to produce the IV and the key decryption key, with the same hash function:

  • 'pbe_salt': the salt, typically an 8-bytes random sequence
  • 'pbe_iter': the iteration count, typically 100000

The parameters specific to user key are:

  • ID: 1
  • size: 32

The user key needs to be truncated to a length of 'encryption_strength', as specified in bytes in the archive properties.

The parameters specific to user IV are:

  • ID: 2
  • size: 16

Once the key decryption key and the IV are derived, 'files_key_ciphertext' is decrypted using AES CBC, with PKCS#7 padding.

Identifying file streams

The name of the MS-CFB stream is derived by shuffling the bytes from the 'file_id' field and then encoding the result as hexadecimal.

The reordering is:

Initial  offset: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Shuffled offset: 3 2 1 0 5 4 7 6 8 9 10 11 12 13 14 15

The 16th byte is usually a NUL byte, hence the stream identifier is a 30-character-long string.

Decrypting files

The compressed stream is split in chunks of 512 bytes, each of them encrypted separately using AES CBS and the global archive encryption scheme. Decryption uses the global AES key (retrieved using the user credentials), and the global IV (retrieved from the deobfuscated archive metadata).

The IV for each chunk is computed by:

  • expressing the current chunk number as little endian on 16 bytes
  • XORing it with the global IV
  • encrypting with the global AES key in ECB mode (without IV).

Each chunk is an independent stream and the decryption process involves end-of-stream handling even if this is not the end of the actual file. This is particularly important for the CTS handler.

Note: this is not to be confused with CTR block cipher mode of operation with operates differently and requires a nonce.

Decompressing files

Compressed streams are zlib stream with default compression options and can be decompressed following the zlib format.

Test cases

Excluded for brevity, cf. https://www.beuc.net/zed/#test-cases.

Conventions and references

Feedback

Feel free to send comments at beuc@beuc.net. If you have .zed files that you think are not covered by this document, please send them as well (replace sensitive files with other ones). The author's GPG key can be found at 8FF1CB6E8D89059F.

Copyright (C) 2017 Sylvain Beucler

Copying and distribution of this file, with or without modification, are permitted in any medium without royalty provided the copyright notice and this notice are preserved. This file is offered as-is, without any warranty.

Posted Sun Sep 10 13:18:30 2017 Tags:

This blog is powered by ikiwiki.