Compound File Binary Format

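Compound File Binary Format (CFBF), also called Compound File for short[1] or, by its full English name, Composite Document File V2 Document[2], is a file format developed by Microsoft to implement COM Structured Storage. It stores the contents of multiple objects in a single file on disk.[3][4][5]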

Microsoft has made the file format openly available. It is widely used, for example by Microsoft Word and Microsoft Access, and it is also the basis of the Advanced Authoring Format.[6]

File structure

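A CFBF file begins with a 512-byte header, followed by the sectors that store the data. The sector size is specified in the header and is usually 512 bytes or 4096 bytes.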
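The layout described above implies a simple mapping from a sector number to a byte offset in the file. The following is a minimal sketch, not a reference implementation. The function name cfbf_sector_offset is hypothetical, and the sketch assumes that the header region occupies exactly one sector-sized slot at the start of the file (so with 4096-byte sectors the 512-byte header is padded out to 4096 bytes).

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of sector number `sect` from the start of the file.
 * sector_shift is the header's _uSectorShift field
 * (9 for 512-byte sectors, 12 for 4096-byte sectors).
 * Assumption: the header fills one sector-sized slot, so sector 0
 * starts exactly one sector size into the file. */
uint64_t cfbf_sector_offset(uint32_t sect, uint16_t sector_shift)
{
    uint64_t sector_size;

    assert(sector_shift < 32);  /* precondition: caller validated the header */
    sector_size = (uint64_t)1 << sector_shift;
    return ((uint64_t)sect + 1) * sector_size;
}
```

With 512-byte sectors, sector 0 therefore starts at byte 512 and sector 2 at byte 1536.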

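CFBF sector types:

  • FAT sectors
  • MiniFAT sectors - used for the mini stream
  • Double-Indirect FAT (DIFAT) sectors - hold the linked list of FAT sector indexes
  • Directory sectors
  • Stream sectors - hold the data content
  • Range Lock sector - holds the locked byte ranges of large files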
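To see when DIFAT sectors come into play, it helps to work out how many sectors the FAT can describe using only the 109 FAT sector slots stored in the header (the _sectFat[109] array) and the 4-byte FAT entry size. The sketch below does that arithmetic. The function and macro names are hypothetical, and the count includes sectors of every kind, FAT sectors among them.

```c
#include <assert.h>
#include <stdint.h>

#define CFBF_HEADER_FAT_SLOTS 109u  /* length of _sectFat[] in the header */

/* Number of sectors addressable through the header's FAT slots alone. */
uint64_t cfbf_sectors_without_difat(uint16_t sector_shift)
{
    uint64_t entries_per_fat_sector;

    assert(sector_shift >= 2 && sector_shift < 32);
    /* Each FAT entry is 4 bytes, so one FAT sector holds size / 4 entries. */
    entries_per_fat_sector = ((uint64_t)1 << sector_shift) / 4;
    return CFBF_HEADER_FAT_SLOTS * entries_per_fat_sector;
}

/* The same limit expressed in bytes of sector space. */
uint64_t cfbf_bytes_without_difat(uint16_t sector_shift)
{
    return cfbf_sectors_without_difat(sector_shift) << sector_shift;
}
```

For 512-byte sectors this gives 109 × 128 = 13,952 sectors, or 7,143,424 bytes. A file that needs more sectors than that has to list its additional FAT sectors in DIFAT sectors.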

CFBF file header

The CFBF header occupies the first 512 bytes of the file. It corresponds to the following C data structure:

 #include <stdint.h>             // fixed-width types: "unsigned long" is 8 bytes on many 64-bit platforms

 typedef uint32_t ULONG;         // 4 Bytes
 typedef uint16_t USHORT;        // 2 Bytes
 typedef int16_t OFFSET;         // 2 Bytes
 typedef ULONG SECT;             // 4 Bytes
 typedef ULONG FSINDEX;          // 4 Bytes
 typedef USHORT FSOFFSET;        // 2 Bytes
 typedef USHORT WCHAR;           // 2 Bytes
 typedef ULONG DFSIGNATURE;      // 4 Bytes
 typedef uint8_t BYTE;           // 1 Byte
 typedef uint16_t WORD;          // 2 Bytes
 typedef uint32_t DWORD;         // 4 Bytes
 typedef ULONG SID;              // 4 Bytes
 typedef struct { uint32_t Data1; uint16_t Data2, Data3; uint8_t Data4[8]; } GUID;
 typedef GUID CLSID;             // 16 Bytes

 struct StructuredStorageHeader { // [offset from start (bytes), length (bytes)]
     BYTE _abSig[8];             // [00H,08] {0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1,
                                 // 0x1a, 0xe1} for current version
     CLSID _clsid;               // [08H,16] reserved must be zero (WriteClassStg/
                                 // GetClassFile uses root directory class id)
     USHORT _uMinorVersion;      // [18H,02] minor version of the format: 33 is
                                 // written by reference implementation
     USHORT _uDllVersion;        // [1AH,02] major version of the dll/format: 3 for
                                 // 512-byte sectors, 4 for 4 KB sectors
     USHORT _uByteOrder;         // [1CH,02] 0xFFFE: indicates Intel byte-ordering
     USHORT _uSectorShift;       // [1EH,02] size of sectors in power-of-two;
                                 // typically 9 indicating 512-byte sectors
     USHORT _uMiniSectorShift;   // [20H,02] size of mini-sectors in power-of-two;
                                 // typically 6 indicating 64-byte mini-sectors
     USHORT _usReserved;         // [22H,02] reserved, must be zero
     ULONG _ulReserved1;         // [24H,04] reserved, must be zero
     FSINDEX _csectDir;          // [28H,04] must be zero for 512-byte sectors,
                                 // number of SECTs in directory chain for 4 KB
                                 // sectors
     FSINDEX _csectFat;          // [2CH,04] number of SECTs in the FAT chain
     SECT _sectDirStart;         // [30H,04] first SECT in the directory chain
     DFSIGNATURE _signature;     // [34H,04] signature used for transactions; must
                                 // be zero. The reference implementation
                                 // does not support transactions
     ULONG _ulMiniSectorCutoff;  // [38H,04] maximum size for a mini stream;
                                 // typically 4096 bytes
     SECT _sectMiniFatStart;     // [3CH,04] first SECT in the MiniFAT chain
     FSINDEX _csectMiniFat;      // [40H,04] number of SECTs in the MiniFAT chain
     SECT _sectDifStart;         // [44H,04] first SECT in the DIFAT chain
     FSINDEX _csectDif;          // [48H,04] number of SECTs in the DIFAT chain
     SECT _sectFat[109];         // [4CH,436] the SECTs of first 109 FAT sectors
 };
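As an illustration of how the offsets in the structure comments can be used, the sketch below reads a few header fields from a raw byte buffer instead of casting the buffer to the struct. Reading byte by byte keeps the result independent of host byte order and compiler padding. The names cfbf_summary, cfbf_read_header, rd16 and rd32 are hypothetical, and the checks are only a plausible minimum, not a full validation of the format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian readers (the header's byte-order mark indicates Intel order). */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical summary type: only the fields this sketch extracts. */
struct cfbf_summary {
    uint16_t major_version;     /* _uDllVersion,        offset 0x1A */
    uint32_t sector_size;       /* 1 << _uSectorShift,  offset 0x1E */
    uint32_t mini_sector_size;  /* 1 << _uMiniSectorShift, offset 0x20 */
    uint32_t fat_sector_count;  /* _csectFat,           offset 0x2C */
    uint32_t dir_start;         /* _sectDirStart,       offset 0x30 */
    uint32_t mini_cutoff;       /* _ulMiniSectorCutoff, offset 0x38 */
};

/* Returns 0 on success, or -1 if the buffer is shorter than 512 bytes,
 * the signature or byte-order mark does not match, or a shift value
 * is too large to be a sector size. */
int cfbf_read_header(const uint8_t *buf, size_t len, struct cfbf_summary *out)
{
    static const uint8_t sig[8] = {0xd0, 0xcf, 0x11, 0xe0, 0xa1, 0xb1, 0x1a, 0xe1};
    uint16_t sector_shift, mini_shift;

    assert(buf != NULL && out != NULL);
    if (len < 512) return -1;
    if (memcmp(buf, sig, sizeof sig) != 0) return -1;
    if (rd16(buf + 0x1C) != 0xFFFE) return -1;

    sector_shift = rd16(buf + 0x1E);
    mini_shift = rd16(buf + 0x20);
    if (sector_shift > 31 || mini_shift > 31) return -1;

    out->major_version = rd16(buf + 0x1A);
    out->sector_size = (uint32_t)1 << sector_shift;
    out->mini_sector_size = (uint32_t)1 << mini_shift;
    out->fat_sector_count = rd32(buf + 0x2C);
    out->dir_start = rd32(buf + 0x30);
    out->mini_cutoff = rd32(buf + 0x38);
    return 0;
}
```

A caller would typically read the first 512 bytes of the file into a buffer, pass it to cfbf_read_header, and only continue parsing if the function returns 0.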

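FAT sectors

Each FAT entry is 4 bytes long and holds either the sector number of the next entry in the FAT chain or one of the following special values:

  • FREESECT (0xFFFFFFFF) - unused sector
  • ENDOFCHAIN (0xFFFFFFFE) - last sector of a FAT chain
  • FATSECT (0xFFFFFFFD) - sector used to store FAT data
  • DIFSECT (0xFFFFFFFC) - sector used to store DIFAT data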
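Following a chain through the FAT can be sketched as a simple loop. The example assumes the whole FAT has already been loaded into memory as an array of 32-bit entries indexed by sector number. The function name cfbf_walk_chain and its error convention are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FREESECT   0xFFFFFFFFu  /* unused sector */
#define ENDOFCHAIN 0xFFFFFFFEu  /* last sector of a chain */
#define FATSECT    0xFFFFFFFDu  /* sector holds FAT data */
#define DIFSECT    0xFFFFFFFCu  /* sector holds DIFAT data */

/* Walks the chain that begins at sector `start`.
 * Up to out_cap sector numbers are stored in out[] (out may be NULL
 * when out_cap is 0). Returns the chain length, or -1 if the chain is
 * corrupt: an index outside the FAT, a special marker other than
 * ENDOFCHAIN inside the chain, or more links than FAT entries (a cycle). */
long cfbf_walk_chain(const uint32_t *fat, size_t fat_len, uint32_t start,
                     uint32_t *out, size_t out_cap)
{
    size_t n = 0;
    uint32_t sect = start;

    assert(fat != NULL);
    while (sect != ENDOFCHAIN) {
        if (sect >= DIFSECT || sect >= fat_len) return -1;
        if (n >= fat_len) return -1;
        if (n < out_cap) out[n] = sect;
        n++;
        sect = fat[sect];  /* the entry names the next sector in the chain */
    }
    return (long)n;
}
```

The cycle check matters in practice: a damaged file can contain a chain that loops back on itself, and without a bound the loop would never terminate.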

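Glossary

  • FAT - File Allocation Table, also called SAT (Sector Allocation Table)
  • DIFAT - Double-Indirect File Allocation Table
  • FAT chain - the group of FAT entries that identifies the sectors allocated to one stream
  • Stream - the content of one data object
  • Sector - a unit of storage occupying 512 or 4096 bytes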

References

  1. Apache POI - POIFS. POI Project. Retrieved 10 May 2011. Archived from the original on 2011-04-26.
  2. How to convert documents between LibreOffice and Microsoft Office file formats on Linux. Retrieved 25 Nov 2016. Archived from the original on 2019-09-21.
  3. Compound Files (Windows). Microsoft Developers Network (MSDN) library – COM SDK. Microsoft Corporation. 20 November 2008. Retrieved 23 September 2009. Archived from the original on 2016-03-07.
  4. Containers: Compound Files. Microsoft Developers Network (MSDN) library – Visual Studio 2008 documentation. Microsoft Corporation. Retrieved 23 September 2009. Archived from the original on 2012-10-18.
  5. Understand Compound Files. Microsoft Developers Network (MSDN) library – ActiveDirectory Rights Management. 25 June 2009. Retrieved 23 September 2009. Archived from the original on 2016-03-09.
  6. AMW Association (formerly AAF Association). Archived at the Internet Archive, archive date 15 August 2000.
