Reproduced with modifications on the OpenKJ Project site with permission from Jim Bumgardner.
You can find the original document as published by Jim here.
This document has been indispensable to me while developing the OpenKJ hosting software. Without the work he did and shared documenting this, there is a very good chance that OpenKJ would never have existed. It can't be overstated how much I appreciate Jim's generosity in making this available. Thanks Jim!
I've made some minor changes to this document for formatting or clarity.
A CD+G is a special Audio Compact disc that contains graphics data in addition to the audio data on the disc. The disc can be played on a regular Audio CD player, but when played on a special CD+G player, can output a graphics signal (typically, the CD+G player is hooked up to a television set).
This file describes how CD+G information is stored on the disc, and provides information to help programmers implement CD+G playback on personal computers equipped with CD+G compatible CD-ROM drives. This document assumes a familiarity with computer programming, and especially the C programming language.
Ever since I released a utility for the Macintosh that allows owners of Apple's CD300 to play CD+G discs (CDG Player), I have received numerous requests asking for the CD+G specification. Unfortunately, this information has not been made available to the general public (as far as I know) and is hard to find. Hopefully, this informal description will help.
It has been a while since I read the CD+G specification. I am describing this system from memory, and will often deviate from the nomenclature used in the actual CD Audio and CD+G specification.
Note from Isaac: While the karaoke world has mostly moved on to media+g files (mp3+g and the like) instead of physical CD+G discs, the cdg files that we use are the exact same format as what's described below. For files ripped from CD+G discs, the cdg file is literally just a raw copy of the sub-channel data from the original disc. For cdg files produced by modern download vendors, they're just generating data in the same format as it would exist on a CD+G disc's sub-channels.
A Compact Disc contains two kinds of data: Content data, which is used to store Audio and Computer software and the like, and sub-channel data, which is normally used by the CD player to help control the disc.
Assuming the disc is being read at normal speed (approx 150 kps) for audio playback, the data is read at 75 sectors per second. In each sector there are 2352 bytes of Content data and 96 bytes of sub-channel data.
On the disc, the sub-channel data is interspersed with the audio data. But when the disc is played back, the two streams of data are separated. On a typical CD-ROM player you can't access the sub-channel information - it is never sent to the computer. In fact, until recently on most CD-ROM players, you couldn't read the Content data on an audio track either. You could only read the Content data a data track which is used specifically for storing computer information. Presumably, this was done to prevent people from easily pirating Audio CDs.
More recently however, manufacturers have started producing drives that allow the computer to access not only the Content data on an audio track, but also the sub-channel data. As we shall see, it is this sub-channel capability which is needed for a computer to be able to play a CD+G disc.
The 96 bytes of sub-channel information in each sector contain 4 packets of 24 bytes apiece.
The structure of the packet is as follows:
typedef struct {
char command;
char instruction;
char parityQ[2];
char data[16];
char parityP[4];
} SubCode;
Each byte in the 16 byte SubCode.data[] field can be thought of as being divided into 8-bits. Each of these bits corresponds to a separate stream of information.
These streams are called "channels", and are labeled starting with the letter P, like so:
Channel# P Q R S T U V W
Bit# 7 6 5 4 3 2 1 0
---------------
Byte # 0 0 0 0 0 0 0 0 0
Byte # 1 0 0 0 0 0 0 0 0
Byte # 2 0 0 0 0 0 0 0 0
.
. (etc.)
.
Byte # 15 0 0 0 0 0 0 0 0
Both the P and Q channels, on a regular Audio CD are used for timing information. They are used to assist the CD Player in tracking the current location on the disc, and to provide the timing information for the time display on the CD Player.
On a regular Audio CD, channels R through W are unused.
The CD+G format takes advantage of the unused channels R thru W. These unused six bits are used to store graphics information. Note that this is an extremely thin stream of information. 6 bits per byte * 16 bytes per packet * 4 packets per sector * 75 sectors per second = 28800 bits per second, or 3.6 K per second. By comparison, a typical 160 x 120 QuickTime movie uses 90K per second.
The CD+G standard is published by Philips and Sony as an extension of the Red Book. It defines a specific system for decoding the sub-channels R thru W. Note: if you gain sufficient control of the CD mastering process, you could define your own Audio-compatible format which is potentially more interesting than CD+G. For example, you could use the sub-channel to store control information for an animatronics robot, MIDI information, etc. There already is a CD+G+MIDI extension of the CD+G standard, but I won't describe it here.
Also, there is an "Extended" CD+G standard which allows for more colors and effects in a backward compatible manner, but I have yet to see any discs published in this format, so I will only describe the basic format.
In the CD+G system, 16 color graphics are displayed on a raster field which is 300 x 216 pixels in size. The middle 294 x 204 area is within the TV's "safe area", and that is where the graphics are displayed. The outer border is set to a solid color. The colors are stored in a 16 entry color lookup table.
Note from Isaac: When writing a parser, I recommend that you output video frames that ONLY contain the safe area. This is necessary to make scrolling work/display properly. See the detailed section on ScrollCopy & ScrollPreset for more info.
Each color in the table is drawn from an RGB space of 4096 (4 bits each for R,G and B) colors.
Since we are using a 16 color table, each pixel can be stored in 4 bits, resulting in a total pixelmap size of 4 * 300 * 216 = 259200 bits = a little less than 32K.
The SubCode.command and SubCode.instruction fields in each SubCode packet are used to determine how the data is to be interpreted. I will enumerate these commands below. When reading the command and instruction fields, you should only pay attention to the lower 6 bits of each of these bytes (e.g. by ANDing the byte with the hex value 3Fh). The upper 2 bits are used for the P and Q channels which you should ignore.
If the lower 6 bits of the command field are equal to 9, then the SubCode packet contains CD+G information, otherwise it should be ignored.
If the packet contains CD+G information, then you use the instruction field (again, the lower six bits), to determine how to interpret the data. An example will be shown below.
A more detailed description of each instruction appears later in this document.
Number | Name | Description |
---|---|---|
1 | Memory Preset | Set the screen to a particular color |
2 | Border Preset | Set the border of the screen (unsafe area) to a particular color |
6 | Tile Block (Normal) | Load a 12x6, 2 color tile and display it normally |
20 | Scroll Preset | Scroll the image, filling in the new area with a color |
24 | Scroll Copy | Scroll the image, rotating the bits back around |
28 | Define Transparent Color | Define a specific color as being transparent |
30 | Load Color Table (Low) | Load in the lower 8 entries of the color table |
31 | Load Color Table (High) | Load in the upper 8 entries of the color table |
38 | Tile Block (XOR) | Load a 12x6, 2 color tile and display it using the XOR method |
This code example shows how you might implement the graphics display dispatcher for a CD+G player in C.
#define SC_MASK 0x3F
#define SC_CDG_COMMAND 0x09
#define CDG_MEMORYPRESET 1
#define CDG_BORDERPRESET 2
// etc... - define each instruction listed above
ReadSubCodePacket(&subCode);
if ((subCode.command & SC_MASK) == SC_CDG_COMMAND) { // CD+G?
switch (subCode.instruction & SC_MASK) {
case CDG_MEMORYPRESET: MemoryPreset(subCode.data); break;
case CDG_BORDERPRESET: BorderPreset(subCode.data); break;
// etc... - implemement each instruction listed above
}
}
This is the (possibly) hard part. If your CD-ROM drive doesn't support this feature, you are shit-out-of-luck. If the *Driver* software that is used to control your CD-ROM drive doesn't support this feature, you are also shit-out-of-luck, although less so, since you might be able to write a new driver, or obtain a new one. Good luck.
As far as I know, the only way this step can be easily solved (today), it to use an Apple Macintosh equipped with an Apple CD300 drive or better. Both this drive, and the driver that is included with it support SubCode access. Apple publishes a document called "CD-ROM Driver Calls" which is available from ftp.apple.com. It describes the newer driver calls that allow you to retrieve SubCode packets on the Apple CD300 drive. These calls often do not work on 3rd party drives. I used the information in this document to code a function based on the Mac Toolbox call PBControl, which will retrieve SubCode packets.
Here I describe the CD+G instructions in detail, hopefully providing enough information for a programmer to implement CD+G playback.
Note that in all these instructions, only the lower 6 bits of each data byte are used. So, for example, If I refer to subCode.data[0], I actually mean (subCode.data[0] & 0x3F).
subCode.instruction 1
In this instruction, the 16 byte data field is interpreted as follows:
typedef struct {
char color; // Only lower 4 bits are used, mask with 0x0F
char repeat; // Only lower 4 bits are used, mask with 0x0F
char filler[14];
} CDG_MemPreset;
Color refers to a color to clear the screen to. The entire screen should be cleared to this color.
When these commands appear in bunches (to insure that the screen gets cleared), the repeat count is used to number them. If this is true, and you have a reliable data stream, you can ignore the command if repeat != 0.
subCode.instruction 2
In this instruction, the 16 byte data field is interpreted as follows:
typedef struct {
char color; // Only lower 4 bits are used, mask with 0x0F
char filler[15];
} CDG_BorderPreset;
Color refers to a color to clear the screen to. The border area of the screen should be cleared to this color. The border area is the area contained with a rectangle defined by (0,0,300,216) minus the interior pixels which are contained within a rectangle defined by (6,12,294,204).
Normal : subCode.instruction 6
XOR: subCode.instruction 38
These commands load a 12 x 6 tile of pixels from the subCode.data area. I recall that in the original CD+G spec, the tile is refereed to as a "font", but I think the word tile is more appropriate, because the tile can (and does) contain any graphical image, not just text. Larger images are built by using multiple tiles. The XOR variant is a special case of the same command, the difference is described below.
The tile is stored using 1-bit graphics. The structure contains two colors which are to be used when rendering the tile. The tile is extracted from 16 bytes of subCode.data[] in the following manner:
typedef struct {
char color0; // Only lower 4 bits are used, mask with 0x0F
char color1; // Only lower 4 bits are used, mask with 0x0F
char row; // Only lower 5 bits are used, mask with 0x1F
char column; // Only lower 6 bits are used, mask with 0x3F
char tilePixels[12]; // Only lower 6 bits of each byte are used
} CDG_Tile;
color0, color1 describe the two colors (from the color table) which are to be used when rendering the tile. color0 is used for 0 bits, color1 is used for 1 bits.
Row and Column describe the position of the tile in tile coordinate space. To convert to pixels, multiply row by 12, and column by 6.
tilePixels[] contains the actual bit values for the tile, six pixels per byte. The uppermost valid bit of each byte (0x20) contains the left-most pixel of each scanline of the tile.
In the normal instruction, the corresponding colors from the color table are simply copied to the screen.
In the XOR variant, the color values are combined with the color values that are already onscreen using the XOR operator. Since CD+G only allows a maximum of 16 colors, we are XORing the pixel values (0-15) themselves, which correspond to indexes into a color lookup table. We are not XORing the actual R,G,B values.
Scroll Preset: subCode.instruction 20
Scroll Copy: subCode.instruction 24
In these instruction, the 16 byte data field is interpreted as follows:
typedef struct {
char color; // Only lower 4 bits are used, mask with 0x0F
char hScroll; // Only lower 6 bits are used, mask with 0x3F
char vScroll; // Only lower 6 bits are used, mask with 0x3F
} CDG_Scroll;
This command is used to scroll all the pixels on the screen horizontally and/or vertically.
The color refers to a fill color to use for the new area uncovered by the scrolling action. It is only used in the Scroll Preset command. In the Scroll Copy command the screen is "rotated" around. For example, in scrolling to the left, pixels uncovered on the right are filled in by the pixels being scrolled off the screen on the left.
The hScroll field is a compound field. It can be divided into two fields like so:
SCmd = (hScroll & 0x30) >> 4;
HOffset = (hScroll & 0x07);
SCmd is a scrolliing instruction, which is either 0, 1 or 2.
0 means don't scroll
1 means scroll 6 pixels to the right
2 means scroll 6 pixels to the left
HOffset is a horizontal offset which is used for offsetting the graphic display by amounts less than 6 pixels. It can assume values from 0 to 5.
Similarly, the vScroll field is a compound field. It can be divided into two fields like so:
SCmd = (vScroll & 0x30) >> 4;
VOffset = (vScroll & 0x0F);
SCmd is a scrolliing instruction, which is either 0, 1 or 2.
0 means don't scroll
1 means scroll 12 pixels down
2 means scroll 12 pixels up.
VOffset is a vertical offset which is used for offsetting the graphic display by amounts less than 12 pixels. It can assume values from 0 to 11.
Note from Isaac: The hOffset and vOffset values should be used to shift the safe area's view of the underlying pixmap data by the number of pixels specified horizontally or vertically. The underlying full image (safe area + border) should not actually be scrolled by this offset. It's basically moving the safe area viewport around inside the full cdg pixmap without changing any of the underlying pixmap data. vOffset shifts the viewport down by the number of pixels specified while hOffset shifts it to the right by the number of pixels specified. Since the offsets are always positive (unsigned), this is always in the down and/or to the right direction.
Smooth horizontal and vertical scrolling in all directions can be done by combining scroll commands. For example, here is a smooth horizontal scroll to the left:
SCmd HScroll
==== =======
0 1 // No actual scroll, just offset safe area viewport 1px right of normal
0 2 // No actual scroll, just offset safe area viewport 2px right of normal
0 3 // No actual scroll, just offset safe area viewport 3px right of normal
0 4 // No actual scroll, just offset safe area viewport 4px right of normal
0 5 // No actual scroll, just offset safe area viewport 5px right of normal
0 6 // No actual scroll, just offset safe area viewport 6px right of normal
2 0 // Scroll 6px left and remove safe area offset
(repeat)
You can create the effect of an infinite panorama by continually loading in new tiles into the border area and scrolling them into view.
subCode.instruction 28
This command is used to define a CLUT color as being transparent, for example so that the the graphics can be overlayed on top of a live video signal.
Note from Isaac: This command is almost never used in professionally produced CDG discs or mp3+g cdg files. It can generally be safely ignored. I don't know what the data structure is, but I suspect it's almost identical to that used by the BorderPreset command, since it's just passing a color in.
Low: subCode.instruction 30
High: subCode.instruction 31
These commands are used to load in the colors for the color table. The colors are specified using 4 bits each for R, G and B, resulting in 4096 possible colors. The only difference between the two instructions is whether the low part or the high part of the color table is loaded.
In these instruction, the 16 byte data field is interpreted as follows:
typedef struct {
short colorSpec[8]; // AND with 0x3F3F to clear P and Q channel
} CDG_LoadCLUT;
Each colorSpec value can be converted to RGB using the following diagram:
[---high byte---] [---low byte----]
7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0
X X r r r r g g X X g g b b b b
Note that P and Q channel bits need to be masked off (they are marked here with Xs).
The Load Color Table commands are often used in CD+G to provide the illusion of animation through the use of color cycling.