Byte


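The byte is a unit of digital information used in computers, mobile phones, smartwatches, and similar devices, regardless of the type of data being measured.^[1]^[2] A byte most commonly consists of eight bits. Historically, the byte was the number of bits used to encode a single character, and for this reason it is the smallest addressable unit of memory in many computer architectures. Byte sizes used to depend on the hardware and ranged from 1 to 48 bits; early systems commonly used 6-bit or 9-bit bytes. Today the standard byte is 8 bits. To distinguish the common 8-bit unit from bytes of arbitrary size, some specifications (for example industry standards, computer networking, and telecommunications) refer to a group of eight bits as an octet. The Internet Protocol (RFC 791) defines an octet as an eight-bit byte.^[3]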

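The International Electrotechnical Commission (IEC) and the Institute of Electrical and Electronics Engineers (IEEE) specify the uppercase letter B as the unit symbol for the byte. For example, MB denotes the megabyte. The bit may be abbreviated as a lowercase b, so that Mb denotes the megabit, which keeps it distinct from the byte. Internationally, the unit octet (symbol o) explicitly defines a sequence of eight bits, removing the potential ambiguity of the term "byte".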
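The difference between the symbols B and b matters in practice, because a figure quoted in Mb is eight times smaller when expressed in MB. The following minimal sketch assumes the modern 8-bit byte; the function names are illustrative and do not come from any standard.

```python
BITS_PER_BYTE = 8  # assumption: the modern 8-bit byte (octet)


def megabits_to_megabytes(megabits: float) -> float:
    """Convert a quantity in Mb (megabits) to MB (megabytes)."""
    return megabits / BITS_PER_BYTE


def megabytes_to_megabits(megabytes: float) -> float:
    """Convert a quantity in MB (megabytes) to Mb (megabits)."""
    return megabytes * BITS_PER_BYTE


if __name__ == "__main__":
    # A rate of 100 Mb per second corresponds to 12.5 MB per second.
    print(megabits_to_megabytes(100))  # 12.5
```

Both functions use the same prefix on each side, so the conversion depends only on the number of bits per byte and not on whether "mega" is read as a decimal or a binary prefix.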

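The size of the byte has historically depended on the hardware, and no definitive standard ever mandated a size. Byte sizes from 1 to 48 bits have been used. Six-bit character codes were a common implementation in early encoding systems, and computers using six-bit and nine-bit bytes were common in the 1960s. These systems often had memory words of 12, 18, 24, 30, 36, 48, or 60 bits, corresponding to 2, 3, 4, 5, 6, 8, or 10 six-bit bytes. Before the term "byte" became common, groupings of bits in an instruction stream were often called syllables^[a] or slabs.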
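The correspondence between word sizes and six-bit byte counts is plain integer division, which the short sketch below checks. The helper name `bytes_per_word` is hypothetical and used only for illustration.

```python
SIX_BIT_BYTE = 6  # byte size of the early systems described above
WORD_SIZES = [12, 18, 24, 30, 36, 48, 60]  # memory word sizes in bits


def bytes_per_word(word_bits: int, byte_bits: int = SIX_BIT_BYTE) -> int:
    """Return how many whole bytes of byte_bits bits fit in a word.

    Raises ValueError if the word is not an exact multiple of the byte size.
    """
    count, remainder = divmod(word_bits, byte_bits)
    if remainder:
        raise ValueError(f"{word_bits}-bit word is not a multiple of {byte_bits} bits")
    return count


if __name__ == "__main__":
    print([bytes_per_word(w) for w in WORD_SIZES])  # [2, 3, 4, 5, 6, 8, 10]
```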

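The modern de facto standard of eight bits, documented in ISO/IEC 2382-1:1993, is a convenient power of two: 2 to the 8th power is 256, so a single byte can hold binary-encoded values from 0 to 255. The international standard IEC 80000-13 codified this common meaning. Many types of applications use information that can be represented in eight or fewer bits, and processor designers commonly optimize for this usage. The popularity of major commercial computing architectures has aided the general acceptance of the 8-bit byte. Modern architectures typically use 32-bit or 64-bit words, made up of four or eight bytes respectively.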
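The arithmetic in this paragraph can be verified directly. The sketch below assumes 8-bit bytes and uses Python's built-in `int.to_bytes` to split a 32-bit value into its four bytes; the sample value `0x12345678` and the big-endian byte order are arbitrary choices made for illustration.

```python
BITS_PER_BYTE = 8  # assumption: the modern 8-bit byte

# An 8-bit byte can take 2**8 distinct values, numbered 0 through 255.
distinct_values = 2 ** BITS_PER_BYTE
max_value = distinct_values - 1


def bytes_in_word(word_bits: int) -> int:
    """Return the number of 8-bit bytes in a word of word_bits bits."""
    return word_bits // BITS_PER_BYTE


# Split an illustrative 32-bit value into bytes, most significant first.
sample = 0x12345678
word32 = sample.to_bytes(bytes_in_word(32), "big")

if __name__ == "__main__":
    print(distinct_values, max_value)            # 256 255
    print(bytes_in_word(32), bytes_in_word(64))  # 4 8
    print(list(word32))                          # [18, 52, 86, 120]
```

Each element of `word32` is an integer between 0 and 255, which is the range a single 8-bit byte can represent.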

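History

The term byte was coined by Werner Buchholz in June 1956, during the early design phase of the IBM Stretch computer, which had bit addressing and variable field length (VFL) instructions with the byte size encoded in the instruction. The word is a deliberate respelling of bite, chosen to avoid accidental mutation to bit.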

Notes

[a] The term "syllable" was used for bytes containing instructions or constituents of instructions, not for data bytes.

References

[1] Blaauw, Gerrit Anne; Brooks, Jr., Frederick Phillips; Buchholz, Werner, 4: Natural Data Units, Buchholz, Werner (ed.), Planning a Computer System – Project Stretch (PDF), McGraw-Hill Book Company, Inc. / The Maple Press Company, York, PA.: 39–40, 1962 [2017-04-03], LCCN 61-10466, (archived from the original (PDF) on 2017-04-03), […] Terms used here to describe the structure imposed by the machine design, in addition to bit, are listed below.
Byte denotes a group of bits used to encode a character, or the number of bits transmitted in parallel to and from input-output units. A term other than character is used here because a given character may be represented in different applications by more than one code, and different codes may use different numbers of bits (i.e., different byte sizes). In input-output transmission the grouping of bits may be completely arbitrary and have no relation to actual characters. (The term is coined from bite, but respelled to avoid accidental mutation to bit.)
A word consists of the number of data bits transmitted in parallel from or to memory in one memory cycle. Word size is thus defined as a structural property of the memory. (The term catena was coined for this purpose by the designers of the Bull computer.)
Block refers to the number of words transmitted to or from an input-output unit in response to a single input-output instruction. Block size is a structural property of an input-output unit; it may have been fixed by the design or left to be varied by the program. […]
[2] Bemer, Robert William, A proposal for a generalized card code of 256 characters, Communications of the ACM, 1959, 2 (9): 19–23, doi:10.1145/368424.368435
[3] Postel, J. Internet Protocol: DARPA Internet Program Protocol Specification. September 1981: p. 43 [28 August 2020]. RFC 791 (in English). "octet: An eight bit byte."

Further reading

Tafel, Hans Jörg. Written at RWTH, Aachen, Germany. Einführung in die digitale Datenverarbeitung [Introduction to digital information processing]. Munich, Germany: Carl Hanser Verlag. 1971: 300. ISBN 3-446-10569-7 (in German). Byte = zusammengehörige Folge von i.a. neun Bits; davon sind acht Datenbits, das neunte ein Prüfbit (NB. Defines a byte as a group of typically 9 bits; 8 data bits plus 1 parity bit.)
Programming with the PDP-10 Instruction Set (PDF). PDP-10 System Reference Manual 1. Digital Equipment Corporation (DEC). August 1969 [2017-04-05]. (Archived (PDF) from the original on 2017-04-05.)
Computer History Museum – Exhibits – Internet History – 1964: Internet History 1962 to 1992. Computer History Museum. 2017 [2015] [2017-04-03]. (Archived from the original on 2017-04-03.)
Jaffer, Aubrey. Metric-Interchange-Format. 2011 [2008] [2017-04-03]. (Archived from the original on 2017-04-03.)
Kozierok, Charles M. The TCP/IP Guide – Binary Information and Representation: Bits, Bytes, Nibbles, Octets and Characters – Byte versus Octet. 3.0. 2005-09-20 [2001] [2017-04-03]. (Archived from the original on 2017-04-03.)

See also

Octet

External links

ГОСТ 8.417-2002 | Страница 19 (in Russian)
