Notes about the VHS video format

VHS, or Video Home System, was developed by JVC, and became the most widely adopted cassette format for home video recorders early in the 1980s. However, VHS on its own merely specifies the size and shape of videocassettes, and other more subtle technical specifications. Before one can nail down the format of a VHS cassette, it must be additionally specified how colour video signals are encoded onto the tape. In this respect, there are three main variations of VHS used around the world: PAL, NTSC and SECAM.

This document introduces and explains the VHS video format to those who lack, but seek, an intermediate understanding of the format. It assumes a basic understanding of wave physics and electromagnetism as they are applied in analogue signal transmission and recording technologies.

What is video?

Video signals give a human viewer the impression of movement by presenting a series of still frames at a rapid rate. Each still image consists of a series of scan lines which are scanned from top to bottom in order to encode the image. Each line is generated by a constantly varying signal, which drives the phosphors coating a television screen to reflect the colours and intensities present in that frame.

Initially, video was monochromatic, and there was only one signal needed to produce an image: the luminance signal. Two further signals are needed to add colour to the image (because there are three independent signals corresponding to human vision,) and these signals are often combined into one waveform called chrominance.
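As a rough illustration of how one luminance signal and two further signals can carry a full colour image, the sketch below splits red, green and blue values into luminance (Y) and the colour-difference pair R-Y and B-Y, then recovers the original values. The weights are the commonly quoted luma coefficients for standard-definition video. They are an assumption here, as are the function names, and real equipment scales and filters these signals further.

```python
# Commonly quoted luma weights for standard-definition video (an assumption;
# these figures do not come from the notes above).
WEIGHT_R, WEIGHT_G, WEIGHT_B = 0.299, 0.587, 0.114


def rgb_to_signals(r, g, b):
    """Split RGB (each 0.0 to 1.0) into luminance and two colour differences."""
    y = WEIGHT_R * r + WEIGHT_G * g + WEIGHT_B * b
    return y, r - y, b - y


def signals_to_rgb(y, r_minus_y, b_minus_y):
    """Recover RGB from luminance and the two colour-difference signals."""
    r = y + r_minus_y
    b = y + b_minus_y
    # Green is whatever luminance is left once red and blue are accounted for.
    g = (y - WEIGHT_R * r - WEIGHT_B * b) / WEIGHT_G
    return r, g, b


if __name__ == "__main__":
    y, ry, by = rgb_to_signals(0.5, 0.5, 0.5)
    print(f"grey: Y={y:.3f} R-Y={ry:.3f} B-Y={by:.3f}")
    print(signals_to_rgb(*rgb_to_signals(0.9, 0.2, 0.1)))
```

For a grey input both colour differences are essentially zero, which is why a monochrome picture needs only the luminance signal.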

These video signals are commonly modulated onto separate radio frequency carriers before being transmitted through the air or along an electrical cable. The various cables commonly used with VHS equipment are:

How is video recorded?

VHS is an analogue system which records video signals diagonally across a magnetic tape of up to half a kilometre in length and 12.7 mm wide. VHS videocassettes have magnetic coatings on their tape surface similar to those of Philips Compact Cassettes (CCs) for audio, and comments with respect to tape and mechanism wear and degradation apply equally to CC and VHS. However, VHS tapes have only one side, and therefore have only one write-protect tab, which may be removed to prevent recording (covering the hole of a cassette without a tab will undo this protection, as is the case for CC.)

Recording video is no simple matter, which is why it took engineers so long after the invention of television to come up with a practical system to do the job. Because video is more dense, in terms of information to be stored per unit time, more magnetized area must be passed by the heads per unit time than for an audio medium like CC. To allow this, VHS videotapes are much wider than CC, and to utilize this additional area the video is recorded in a series of minutely spaced diagonal strips along the tape surface.

As it is impractical to physically separate the component luminance and chrominance signals on a home video cassette, both signals appear on a single "track." This prevents the use of a simple mapping between magnetization and image colour as was possible between magnetization and the simple audio waveform. Also, because the video information needs to be broken into units called frames, there needs to be some additional logic to deal with the added complications such structure brings. It is also for this reason that wow and flutter, already a problem when recording and playing back an audio medium like CC, are more serious in a video system, so a VCR contains many compensating mechanisms to ensure frames are read and written at the correct rate.

Why are there different variations of VHS?

Because sound waves are simply encoded by magnetizing the tape in proportion to the sound level being sampled, the only really useful change to an analogue audio recording system is to vary the speed that the magnetic patterns are recorded at. Once Philips established its Compact Cassette format, there was never any reason to develop differing incompatible formats.

However, the story is much different for recorded video. Had the world had a single TV broadcasting standard in the late 1970s, there would have perhaps been only one type of video encoding. However, TV technology grew up in a very localized fashion, with different countries adopting the TV broadcast standards which best suited their needs. There was no need to standardize TV broadcasting throughout the world, as TV signals didn't travel very far.

It is for this reason that there is no one single flavour of VHS. It is insufficient for one to say, for the purposes of an international exchange of videocassettes, that the format you will be delivering is VHS. Because TV systems are localized between countries, and because video formats are tied to these TV systems, video formats are also localized. Unless you are merely exchanging with your neighbours, there will inevitably be problems.

What are the different television systems, and how do they differ?

Television broadcast systems determine what encoding will be used on a VHS cassette. The overriding difference between television broadcast systems, at least as far as VHS format is concerned, is the method of colour encoding, and thus the various possibilities are often referred to as colour systems.

NTSC (National Television System Committee) — This television broadcast system was developed in the United States by the EIA, and is also in use in Japan and Canada. One aim of the EIA was to provide compact TV signals, providing many channels, each occupying a relatively narrow bandwidth. The EIA system was the first significant TV broadcast standard of those in use today, and the colour version also led the way in pioneering colour television. Due to its pioneering role, the colour version, arrived at by the NTSC's tweaking of the original EIA system, is deficient in some technical areas.

The main criticism is its typically inferior vertical resolution of 525 scan lines (in the variant called NTSC M,) although the asymmetric nature of the colour encoding[1] used, in combination with other factors, leads to poor colour resolution as well. NTSC typically has 60 fields per second (corresponding to 30 frames of interlaced material per second,) and the pair 525/60 is the NTSC shape, which is designated, along with other technical specifications, by the letter M.

A variant with a 4.43 MHz colour subcarrier frequency called NTSC 4.43 is used in consumer video equipment to fool PAL televisions into displaying some, albeit incorrect, colour information along with the NTSC signal shape. When used for recording, this mode usually results in NTSC shaped signals with PAL colour.

The monochrome variant of NTSC M is named for the committee which standardized it: EIA (i.e. not NTSC N, used in Bolivia, which is closer to CCIR.)

PAL (Phase Alternating Line) — Based on a standard developed by the CCIR, this television broadcast system is common outside the United States, and is used in Australia, New Zealand, Europe, most of SE Asia and parts of the Middle East (PAL B/G,) UK, South Africa and Hong Kong (PAL I) as well as South America (PAL M/N.)

Typical PAL signals produce a crisper display using 625 scan lines, 100 more than NTSC. The typical PAL shape has 50 fields per second which yields a frame rate of 25 frames per second, and the typical PAL shape is denoted by the pair 625/50. Subjectively, the images of PAL are often said to be more vivid and photographic than NTSC, and this can be attributed to the symmetric nature of the colour encoding[2] combined with an increased allocation of luminance bandwidth[3]. PAL is also less susceptible to various forms of colour/luminance interference than NTSC, due to a system leading to the cancellation of phase errors in pairs of lines.

There are several PAL variants, denoted by a number of letters, although this does not typically affect recorded video. The exceptions are PAL M (Brazil) and PAL N (Argentina): these systems are problematic because they are effectively hybrid forms of NTSC and PAL.

The CCIR lends its name to the monochrome variant of typical PAL.

SECAM (Séquentiel Couleur à Mémoire) — A television broadcast system similar to PAL, used in some European countries, their former colonies, and around the Asian continent. The main variants are SECAM B/G (Europe and Middle East,) SECAM D/K (China and Russia) and SECAM L (France.) SECAM M (Cambodia and Vietnam) more closely resembles NTSC than PAL, because it uses the M shape and an incompatible colour system.

The variant called MESECAM has two interpretations: as Middle East SECAM it corresponds to SECAM with a B/G transmission mode, but in VCRs it usually refers to SECAM recorded with a false PAL colour signal. The result is monochrome on true SECAM equipment.
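The nominal signal shapes quoted above can be turned into frame and line rates with a little arithmetic. The following is a minimal sketch using only the 525/60 and 625/50 pairs given in the text. The function names are illustrative, and the nominal figures ignore any small adjustments made in practice.

```python
def frame_rate(fields_per_second):
    """Interlaced video presents two fields for every complete frame."""
    return fields_per_second / 2


def line_frequency(scan_lines, fields_per_second):
    """Scan lines drawn per second, in Hz, for a nominal signal shape."""
    return scan_lines * frame_rate(fields_per_second)


# Nominal shapes quoted in the notes: (scan lines, fields per second).
SHAPES = {"NTSC M (525/60)": (525, 60), "typical PAL (625/50)": (625, 50)}

if __name__ == "__main__":
    for name, (lines, fields) in SHAPES.items():
        print(f"{name}: {frame_rate(fields):g} frames/s, "
              f"{line_frequency(lines, fields):g} lines/s")
```

Although the two shapes differ in both line count and field rate, their nominal line frequencies come out within about one percent of each other.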

What are the corresponding video colour systems?

NTSC — NTSC TV signals can be recorded onto VHS NTSC format videocassettes. Typical NTSC TV signals have a nominal horizontal resolution of 330 lines, but even recorded at 2.2 metres per minute (a faster rate than PAL, meaning lower tape density and therefore better resolution,) the corresponding horizontal resolution of VHS NTSC is only 220 lines. Because VHS NTSC is recorded at a faster speed than VHS PAL, NTSC and PAL videocassettes of equal physical length have different playing times. For example, a VHS NTSC T-120 videocassette records two hours of video at normal (SP) speed. A corresponding PAL cassette of the same physical length, the 258 metre long E-180, records an extra hour at normal (SP) speed. The difference stems from the poor properties of NTSC as a home recording format: to eliminate disturbing visual interference patterns in early recorders, recordings had to be less dense. The alternating phase of PAL minimized such problems, and as it was developed later, it also benefitted from improved tape transport mechanisms.

PAL — PAL TV signals can be recorded onto VHS PAL format videocassettes. By the time the PAL variant was designed, video technology had progressed since the original VHS NTSC standard was developed in the 1970s (for use in Japan and the US,) and there were fewer technical problems in transferring PAL to videotape. VHS PAL cassettes were therefore capable of recording at 1.4 metres per minute while still achieving a 250 line horizontal resolution. This is a significant degradation from PAL TV signals, which have a nominal horizontal resolution of 400 lines. In relative terms the degradation is greater than that for NTSC TV to NTSC VHS (a loss of about 38% against 33%,) but in absolute terms VHS PAL does better than VHS NTSC. VHS PAL also has the benefit of PAL's 625 scan lines, but does suffer (to some extent) from its lower frame rate of 25 frames per second.

SECAM — SECAM TV signals can be recorded onto VHS SECAM format videocassettes. The physical parameters are similar to PAL, with VHS SECAM recorded at 1.4 metres per minute to yield a horizontal resolution of 250 lines. SECAM colour is problematic for most multisystem VCRs[4], so they instead use a format called MESECAM, which uses false PAL colour signals. This format is monochromatic on true SECAM equipment.
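The playing times and resolution figures quoted in this section follow from simple arithmetic. The sketch below is only a check on those figures, using the tape speeds, tape length and line counts given in the text. The function names are illustrative and not part of any standard.

```python
def playing_minutes(tape_length_m, speed_m_per_min):
    """Playing time of a tape of the given length at the given tape speed."""
    return tape_length_m / speed_m_per_min


def resolution_loss(broadcast_lines, vhs_lines):
    """Fraction of nominal horizontal resolution lost when recording to VHS."""
    return 1 - vhs_lines / broadcast_lines


if __name__ == "__main__":
    length = 258  # metres, the E-180 length quoted above
    print(f"NTSC SP: {playing_minutes(length, 2.2):.0f} min")
    print(f"PAL SP:  {playing_minutes(length, 1.4):.0f} min")
    print(f"NTSC loss: {resolution_loss(330, 220):.1%}")
    print(f"PAL loss:  {resolution_loss(400, 250):.1%}")
```

With these inputs the same length of tape gives roughly two hours at the NTSC speed and roughly three at the PAL speed, in line with the T-120 and E-180 ratings.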

What about audio?

Audio on VHS cassettes is either recorded in thin bands along the edge of the tape (the linear tracks) or in angular tracks "underneath" the video signal (the so-called HiFi stereo soundtrack.)

The recording method for linear tracks is similar to CC, and is standard across all video formats. However, because of the differing tape speeds, NTSC sound is recorded at a different rate to PAL or SECAM sound: an NTSC video cassette will sound slow and distorted on a PAL VCR, and a PAL video cassette will sound fast and garbled on an NTSC VCR.
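To put a rough figure on how slow or fast the linear audio will sound, the playback speed ratio can be converted into a pitch shift. This is a sketch that assumes the VCR simply runs the tape at its own SP speed, and it uses the tape speeds quoted in these notes. The function names are illustrative.

```python
import math

# SP tape speeds in metres per minute, as quoted in these notes.
SP_SPEED = {"PAL": 1.4, "NTSC": 2.2}


def playback_ratio(cassette_system, vcr_system):
    """How fast the tape runs relative to the speed it was recorded at."""
    return SP_SPEED[vcr_system] / SP_SPEED[cassette_system]


def pitch_shift_semitones(ratio):
    """Pitch change, in equal-tempered semitones, for a given speed ratio."""
    return 12 * math.log2(ratio)


if __name__ == "__main__":
    for cassette, vcr in (("NTSC", "PAL"), ("PAL", "NTSC")):
        ratio = playback_ratio(cassette, vcr)
        print(f"{cassette} cassette on {vcr} VCR: speed x{ratio:.2f}, "
              f"pitch {pitch_shift_semitones(ratio):+.1f} semitones")
```

A ratio below one corresponds to the slow, low-pitched sound described above, and a ratio above one to the fast, high-pitched case.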

The recording method used for VHS HiFi stereo is more closely related to how FM radio is transmitted, and quality is much less dependent on colour system (although you will not be able to hear it at all unless your VCR is set up for the colour system corresponding to the cassette format in use.)

The monaural linear audio track recorded on every VHS cassette (even on VHS HiFi, to make the format mono-compatible) is low fidelity. For instance, the sound on the monaural audio track of a PAL cassette has a sound frequency range of about 70 Hz to 10000 Hz, which sounds slightly muffled and lacks bass. NTSC is only marginally better. When the VHS format was later extended to optionally provide two additional tracks for stereo high fidelity sound, a frequency range of 20 Hz to 20000 Hz became possible. These two tracks are also occasionally used to provide a bilingual monaural recording, with the left channel recording the primary audio and the right channel recording the secondary audio.

Because VHS NTSC was developed first, and because it is recorded at a faster rate than PAL (yielding a greater recording area per unit of time,) there are also NTSC cassettes which have stereo on linear tracks. This is of much reduced quality when compared to VHS HiFi, but historically it provided a useful stopgap solution that is still supported by many publishers (if not equipment manufacturers) today. Because linear audio can be usefully passed through a noise-reduction filter as it is recorded, some tapes are marked "Dolby System on linear tracks," meaning a Dolby B filter has been applied to these tracks to reduce "hiss." Some PAL and SECAM cassettes also bear the marking "stereo on linear tracks," but this is usually left over from the NTSC sleeves: rarely is such audio actually present, and even more rarely is it actually usable. Audiotek do not recognise this audio format beyond this note.

What are SP, LP and EP?

The standard playback speed is SP, and is technically the recommended video cassette playback speed. However, demand for more recording capacity has led to the development of LP for PAL and NTSC, and EP for NTSC.

SP — Standard Playback VHS is recorded at 1.4 (PAL) or 2.2 (NTSC) metres per minute. The maximum recording time for an SP videocassette (E-300 or T-180) is typically 300 (PAL) or 180 (NTSC) minutes. The quality of SP material is higher than other speeds, because each diagonal strip of recorded video on the tape is separated by a guard band which inhibits confusion of the signals recorded on adjacent bands, allowing signals to be picked up cleanly.

LP — Long Playback VHS is recorded at 0.7 (PAL) or 1.1 (NTSC) metres per minute. The maximum recording time for an LP videocassette (E-300 or T-180) is typically 600 (PAL) or 360 (NTSC) minutes. At LP speed, sound and picture quality begin to deteriorate. This deterioration comes about because there is less space between the diagonal strips, which may actually overlap. This overlap is possible because the two special video heads used in LP mode record at different angles, and this angular difference means that confusion is not a serious problem. Obviously, however, SP is the preferred format when signals need to be clearly picked up.

EP (NTSC) — Extended Playback VHS for NTSC, also known as SLP (Super Long Playback,) is recorded at about 0.7 metres per minute, one third of the SP speed. At EP speed, sound quality is low fidelity. The maximum recording time for a single EP videocassette (T-180) is typically 540 minutes. EP is available on NTSC only, because of that format's degenerate SP variant, which is faster than it needs to be given modern equipment.
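The maximum recording times above are simple multiples of a cassette's SP rating. The sketch below assumes multipliers of one, two and three for SP, LP and EP, which is what the quoted maximum times imply. The table and function names are illustrative only.

```python
# Recording time relative to SP, inferred from the maximum times quoted above.
TIME_MULTIPLIER = {"SP": 1, "LP": 2, "EP": 3}

# Speeds described in these notes for each colour system.
MODES_BY_SYSTEM = {"PAL": ("SP", "LP"), "NTSC": ("SP", "LP", "EP")}


def recording_minutes(sp_minutes, system, mode):
    """Recording time for a cassette rated at sp_minutes when run in mode."""
    if mode not in MODES_BY_SYSTEM[system]:
        raise ValueError(f"{mode} is not described for VHS {system}")
    return sp_minutes * TIME_MULTIPLIER[mode]


if __name__ == "__main__":
    for system, rating in (("PAL", 300), ("NTSC", 180)):
        for mode in MODES_BY_SYSTEM[system]:
            minutes = recording_minutes(rating, system, mode)
            print(f"VHS {system} {mode}: {minutes} min")
```

Asking for EP on a PAL cassette raises an error, reflecting that EP is described here for NTSC only.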

Playing VHS Y cassettes in a VHS X country

There are a number of options for playing VHS NTSC cassettes in a country which uses PAL as its primary video system, and similarly for playing VHS PAL cassettes in a country which uses NTSC. However, it is more difficult in NTSC countries to find cost-effective solutions to play PAL material. This is because the need to play PAL material there is much reduced: a vast library of NTSC material is readily available.

Let X be your native video system and Y designate an imported cassette format:

Notes about Audiotek VHS recordings

The catalogue of Audiotek VHS video recordings is the ATKV catalogue. All physically existing cassettes comprising the Audiotek VHS library, whether they be Audiotek or third-party produced, or whether the item be on hand in a local or remote location, donated to another library or on loan, are numbered from 001–899 within the catalogue. Each cassette is allocated a consecutive number as it is finalized. Cassettes may be deleted from this range, and the number reallocated (or not) as required.

The range 900–999 of the ATKV catalogue is reserved for cassettes which are produced by Audiotek specifically for external agents or those which have been created for internal backup purposes. Numbers are allocated to cassettes in consecutive order starting at 900. Allocations in this range are permanent, and there is no requirement that any cassette catalogued actually physically exist in the Audiotek library.
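The two catalogue ranges can be expressed as a small lookup. This is a sketch of the numbering rules as described above, not Audiotek's own tooling. The function names and the range labels "library" and "reserved" are hypothetical.

```python
def classify_atkv(number):
    """Return which ATKV catalogue range a cassette number falls into."""
    if 1 <= number <= 899:
        return "library"   # physically existing cassettes in the VHS library
    if 900 <= number <= 999:
        return "reserved"  # external productions and internal backups
    raise ValueError(f"{number} is outside the ATKV catalogue (001-999)")


def format_atkv(number):
    """Render a catalogue number with the three digits used in the catalogue."""
    classify_atkv(number)  # reject numbers outside the catalogue
    return f"{number:03d}"


if __name__ == "__main__":
    for n in (7, 899, 900):
        print(format_atkv(n), classify_atkv(n))
```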

Most Audiotek master video recordings before 1999 were recorded in VHS PAL SP with a hi-fi stereo soundtrack (as well as a mono-compatible linear soundtrack.) Information on the quality of a programme is recorded in the full Audiotek video catalogue. SVHS, a superior variant of VHS which is closer to broadcast quality television and does not suffer from generational degradation, replaced VHS as Audiotek's master video format of choice in 1999. SVHS recordings use a more detailed quality key than that given here: see Audiotek's Notes about the SVHS video format.

The PAL colour system used by most Audiotek recordings uses 625 scan lines to encode each image and 50 fields to present each second of video, and has a frame rate of 25 frames/second. The chrominance is symmetric and consists of R-Y and B-Y signals. The additional designation B/W indicates that the source is "black and white," which means that the image is monochromatic, so that the chrominance information is suppressed, leaving the luminance signal to dominate. The further designation Letterbox or Wide indicates that the image is widescreen:

Wide is the designation applied to prerecorded material which is labelled as "widescreen," so that the full extent of the intended image is shown throughout the material. Letterbox is applied to any broadcast which exhibits a non-4:3 image aspect—there still may be panning and scanning taking place, or the bars may have been added for artistic reasons.

The audio types appearing in the Audiotek video catalogue are summarized by the following key:

Audio Key

Linear: The sound is recorded on a linear track using a simple encoding. This designation implies a mono source. On VHS PAL SP, one can expect a low signal to noise ratio, considerable hiss, a frequency range of 0.1 to 10 kHz and a maximum wow/flutter of 0.4%. At LP the frequency range is approximately halved, and wow/flutter trebled. Linear signals in general depend on the speed of the surface past the head, and the quality of the surface. Magnetic tapes are graded as IEC I, II or IV, in increasing order of particle density, and the grade hence directly indicates the quality of the linear signal.
Dolby: The recording employs Dolby B noise reduction on the linear track. This improves the signal to noise ratio of the soundtrack, and is typically applied to the linear track of all prerecorded VHS tapes[5].
Stereo: The sound is stereophonic and has two components, a left and a right channel, which are recorded as a dual carrier FM azimuth signal in a deep magnetic layer of the tape. Both the left and right signals are high-fidelity, with a potential frequency range of 0.02 to 20 kHz. The azimuth recording employed reduces wow/flutter considerably, but FM encoding does distort the signal a little. The tape speed will affect the frequency response, but the azimuth recording makes this degradation slight compared with linear audio. However, at slow speeds, tracking of the deep layer signals can be more difficult, leading to a rumbling or noisy distortion on some playback equipment.
Stereo HiFi: The signal to noise ratio is guaranteed to be very high. On VHS this means that the signals in the deeper layer are easily tracked by all equipment, owing to a high grade of tape surface and the employment of good quality record heads. (VHS has hi-fi sound recorded as a dual carrier FM azimuth signal in a deep magnetic layer of the tape.) Both the left and right signals are high-fidelity, with a frequency range of 0.02 to 20 kHz. The azimuth recording employed reduces wow/flutter considerably, but FM encoding does distort the signal a little.
Mono: The sound is monaural and has only one component, a centre channel, which is recorded in the dual carrier hi-fi FM azimuth signal, but both left and right channels are the same. While signal to noise is often the same as other hi-fi signals, a generation two recording often has a poorer frequency range and increased wow/flutter, owing to the fact the generation one source may have had linear audio.
Mono HiFi: The sound is recorded in the dual carrier hi-fi FM azimuth signal, but both left and right channels carry the same signal, making the dual signal redundant. This ensures compatibility with hi-fi stereo signals. Additionally, this designation confirms that the source of the monaural soundtrack was high fidelity.
Dual: Two sound signals have been recorded, the primary and secondary channels. On VHS, this implies the primary audio is recorded in the left hi-fi audio signal and the right audio signal contains secondary audio. The designations Left and Right are used to indicate degenerate recordings where there is only a primary signal.
Stereo + Linear/Dolby: Three sound signals have been recorded: the hi-fi stereo audio pair and the linear track. The hi-fi pair and the linear track do not contain corresponding sound information. (This is the result on an "audio dubbed" stereo tape.) When the linear track has Dolby B applied, the designation is Stereo+Dolby, otherwise it is Stereo+Linear.
Mono + Linear/Dolby: Two sound signals have been recorded: a monaural signal on the hi-fi audio tracks and the linear track. These two signals do not contain corresponding sound information. (This is the result on an "audio dubbed" stereo tape.) When the linear track has Dolby B applied, the designation is Mono+Dolby, otherwise it is Mono+Linear.
Dual + Linear/Dolby: Three sound signals have been recorded: two independent hi-fi tracks and a linear track. All channels contain independent information. (This is the result on an "audio dubbed" bilingual tape.) When the linear track has Dolby B applied, the designation is Dual+Dolby, otherwise it is Dual+Linear.
Dolby Stereo/Mono/Dual: This designation is used to indicate non-surround material sourced from Dolby A motion pictures. It additionally implies that a Dolby B filter has been applied to the linear track. (This contrasts with the Dolby definition of "Dolby Stereo.")
Surround: The sound is quadraphonic and is encoded from four channels. A Dolby Stereo MP (Motion Picture) matrix encoding is assumed, where these four channels are reduced to a two channel quasi-stereo signal. The four channels are the normal left and right sources, and additionally an in-phase centre source (reduced by 3 dB) and a Dolby B encoded surround source (reduced by 3 dB and filtered to 0.1 to 7 kHz) which is phase shifted by 90 degrees into the left and right outputs. The separation of adjacent channels is then 3 dB[6], producing a balanced quadraphonic sound field. On VHS, the quasi-stereo signal is encoded into the hi-fi FM azimuth carriers as before. A field artifact of 50 Hz for PAL slightly disturbs this signal. Additionally, the Surround designation does not guarantee the signal has not been distorted so as to prevent its intended clear reproduction: see the entry for the Dolby Surround designation.
Dolby Surround: The normal designation for movies and other materials on VHS tapes which have been prerecorded. It implies the Surround designation, along with a guarantee of a good signal. It additionally implies that a Dolby B filter has been applied to the linear track.
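The matrix encoding described under the Surround designation can be sketched with phasors, treating each channel as a complex amplitude at a single frequency. This is a simplified illustration rather than the actual Dolby process: the Dolby B encoding and band-limiting of the surround source are omitted, feeding the surround into the two outputs with opposite signs is an assumption, and the passive decode with its scaling is hypothetical.

```python
import math

# A 3 dB reduction expressed as an amplitude gain (about 0.708).
MINUS_3_DB = 10 ** (-3 / 20)


def matrix_encode(left, right, centre, surround):
    """Fold four channel phasors into a two channel (Lt, Rt) pair.

    Multiplying by 1j models a 90 degree phase shift at one frequency.
    Feeding the surround into Rt with the opposite sign is an assumption.
    """
    lt = left + MINUS_3_DB * centre + 1j * MINUS_3_DB * surround
    rt = right + MINUS_3_DB * centre - 1j * MINUS_3_DB * surround
    return lt, rt


def passive_decode(lt, rt):
    """Hypothetical passive decode: sum gives centre, difference gives surround."""
    centre = MINUS_3_DB * (lt + rt)
    surround = MINUS_3_DB * (lt - rt)
    return lt, rt, centre, surround


def level_db(phasor):
    """Level of a phasor in dB relative to unit amplitude."""
    return 20 * math.log10(abs(phasor))


if __name__ == "__main__":
    # A left-only source leaks into the decoded centre 3 dB down.
    left_out, _, centre_out, _ = passive_decode(*matrix_encode(1, 0, 0, 0))
    separation = level_db(left_out) - level_db(centre_out)
    print(f"left to centre separation: {separation:.1f} dB")
```

Under these assumptions a centre-only source lands equally in both outputs, a surround-only source lands in them with opposite phase, and a left-only source reappears in the decoded centre only 3 dB below the left output, which matches the adjacent-channel separation quoted above.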

The subjective and objective quality of a programme from the Audiotek video catalogue is, at least in part, described by the following keys:

Subjective Quality Key

Very High: Picture is of a remarkably high standard, does not suffer from any interference, and will only suffer from the normal granularity and variances associated with the video and television systems in use. Audio has a remarkably high signal to noise ratio.
High: Picture is of an appreciably high standard, and will not distract the viewer from enjoying the programme. Some ghosting is normal, and slight interference or fuzziness is to be expected, particularly for more dense recordings than SP. Audio is low noise.
Medium: Picture suffers from non-variable interference, strong ghosting or distortion. Picture and audio are consistent, but not of a high quality, and could perhaps be improved by advanced image processing or audio hiss reduction.
Low: Picture suffers from variable interference or noise, strong variable ghosting, loss of colour, sharpness or contrast. Audio may be muffled or suffer from excessive hiss.

Generation Key (objective quality measure)

(1) The recording was made direct from a source with quality in excess of the medium in use (e.g. a broadcast quality source downconverted to VHS is generation one.) Generation one can be assumed for all High quality recordings which are not designated as being of any other generation.
(2) The recording is a copy made from another recording on a comparable medium, the source recording being generation one. Generation two VHS recordings made by Audiotek's JVC HR-J625EA are further designated by an asterisk, e.g. (2*.) Generation two SVHS recordings made by Audiotek's JVC HR-S5500AM are also designated in this way.
(3+) The recording is a copy of a copy made on a comparable medium. Expect image quality to degrade significantly with the generation index on VHS.
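The generation designations are regular enough to be read mechanically. The sketch below assumes the designations appear exactly in the forms shown in the key, such as (1), (2), (2*) and (3+). The function name and the field names in the result are hypothetical.

```python
import re

# A bracketed generation number with an optional "*" or "+" suffix.
_DESIGNATION = re.compile(r"\((\d+)([*+]?)\)")


def parse_generation(designation):
    """Split a generation designation into its number and suffix flags."""
    match = _DESIGNATION.fullmatch(designation.strip())
    if match is None:
        raise ValueError(f"not a generation designation: {designation!r}")
    number, suffix = int(match.group(1)), match.group(2)
    return {
        "generation": number,
        "or_later": suffix == "+",       # "(3+)": a copy of a copy, or worse
        "audiotek_deck": suffix == "*",  # "(2*)": copied on an Audiotek JVC deck
    }


if __name__ == "__main__":
    for text in ("(1)", "(2*)", "(3+)"):
        print(text, parse_generation(text))
```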

Accurate timings in hours, minutes and seconds are given for all programmes in the Audiotek VHS catalogue. Please refer to individual cassette manifests where these are available.

[1]  The green-purple Q signal is transmitted with half of the colour bandwidth (when compared to red-cyan I,) as the eyes are less sensitive to this complementary colour pair.

[2]  Both R – Y and B – Y are transmitted with the same full colour bandwidth.

[3]  The video channel bandwidth is 4.2 MHz for NTSC/EIA and 5.0 MHz for PAL/CCIR.

[4]  Difficulties arise because SECAM uses frequency modulation on its colour subcarrier. There is therefore a greater risk of intermodulation of chrominance and luminance on video playback, as luminance is also encoded via frequency modulation. PAL and NTSC use quadrature modulation, where frequency is kept constant, amplitude gives colour saturation, and phase gives hue. This is the reason why SECAM is often not recorded with a proper SECAM colour signal.

[5]  Dolby B offers practically no benefit on hi-fi tracks, at least when not used as part of a surround encoding.

[6]  A Dolby Pro-Logic decoder will allow 30 dB channel separation, as it employs an active adaptive matrix decode step.


Copyright in the material (literary, programmatic, graphic and otherwise) comprising this XHTML document and embedded external elements is claimed by the author, and its publication on this web site does not waive that copyright. The material may not be copied in any form (including printed and electronic forms) excepting the copying actions occurring during the normal course of an HTTP transaction. Anything other than temporary storage in a cache is expressly prohibited.

Author and editor: Kade "Archer" Hansson

Last updated: Sunday 28th May 2000