Standardized Field Sobriety Tests (SFSTs) Expert Witness Services in Michigan


Introduction

The standardized field sobriety test (SFST) battery is the single most consequential piece of pre-arrest evidence in a Michigan operating-while-intoxicated (OWI) prosecution. The officer's score on these three tests typically determines whether the driver is arrested, whether a chemical test is demanded under the implied-consent statute, and whether the prosecution will proceed under MCL 257.625(1)(a) for operating while intoxicated, MCL 257.625(1)(b) for operating with an unlawful bodily alcohol level, or MCL 257.625(3) for operating while visibly impaired. Yet the battery is also the most frequently misadministered piece of evidence in the case. The National Highway Traffic Safety Administration (NHTSA) is explicit on this point in its 2023 Participant Manual, Session 6, page 16: the validation of the battery applies only when the tests are administered in the prescribed and standardized manner, the standardized clues are used to assess the subject's performance, and the standardized criteria are employed to interpret that performance. If any one of the SFST elements is changed, the manual states, the validity may be compromised.

Michigan courts have qualified me as an expert witness in the SFST battery on numerous occasions over the past fifteen years. This page describes the scientific origin of the battery, the NHTSA-sponsored validation studies, the Michigan statutory framework that governs admissibility, the categories of administrative error that I most frequently identify in the cases I review, and the specific training and courtroom experience that support my qualification as an expert in this area. The horizontal gaze nystagmus (HGN) test, while also a component of the SFST battery, raises distinct vision-science issues and is addressed in detail on the dedicated Horizontal Gaze Nystagmus (HGN) Expert Witness Services subpage.

Origin and Development of the SFST Battery

The SFST battery did not exist before the mid-1970s. In late 1975, NHTSA contracted with the Southern California Research Institute (SCRI) to identify a small set of roadside tests that could reliably discriminate between drivers above and below a target blood alcohol concentration (BAC). The principal investigators were Dr. Marcelline Burns and Dr. Herbert Moskowitz, and the resulting study—Burns and Moskowitz, Psychophysical Tests for DWI Arrest, Final Report, DOT-HS-802-424 (NHTSA 1977)—is the foundational document for everything that followed.

SCRI began with eleven candidate tests drawn from a nationwide survey of roadside practices: the alphabet recitation, counting backward, grip strength, maze tracing, telegraph-key tapping, finger count, finger-to-nose, tongue twisters, two-point tactile discrimination, color-number naming, and serial performance. Pilot work on small samples eliminated most of those tests as unsuitable, and SCRI then conducted a laboratory study using six tests: walk-and-turn, one-leg stand, alcohol gaze nystagmus, finger-to-nose, finger counting, and drawing on paper. The 1977 report ultimately recommended a three-test battery consisting of alcohol gaze nystagmus, walk-and-turn, and one-leg stand.

The 1977 study reported a false-arrest rate of 46.5 percent, and Burns acknowledged in the report that an error rate of 47 percent in making arrests was not acceptable. The study population was heavily skewed: approximately 80 percent of subjects were in their twenties, and roughly two-thirds were male, as documented at page 18, Figure 4 of the report. These limitations are rarely disclosed to juries.

NHTSA immediately funded a follow-up study that combined further laboratory work with field testing. The result was Tharp, Burns, and Moskowitz, Development and Field Test of Psychophysical Tests for DWI Arrest, Final Report, DOT-HS-805-864 (NHTSA, March 1981). The 1981 study replaced alcohol gaze nystagmus (AGN) with horizontal gaze nystagmus (HGN), refined the standardized administration and scoring procedures, and reported that ten police officers, evaluating 297 participants whose BACs ranged from 0.000 to 0.18, were able to discriminate above and below the then-current 0.10 BAC threshold with reasonable but imperfect accuracy. The 1981 study disclosed, but did not feature, the finding at page 17 that approximately half of the subjects exhibited some nystagmus in at least one eye when their eyes were deviated maximally—a fact that defense experts have repeatedly cited to challenge the specificity of the HGN test.

The 1977 and 1981 studies, considered together, constitute the bulk of the scientific foundation on which the standardized administration and scoring procedures for the walk-and-turn and one-leg stand tests rest. In 1983, a study conducted for the National Highway Traffic Safety Administration formally validated the three-test battery, claiming that Horizontal Gaze Nystagmus was 77% accurate, the Walk-and-Turn was 68% accurate, and the One-Leg Stand was 65% accurate. NHTSA has repeatedly revised the training curriculum without revisiting these early studies.

The 0.08 Validation Studies: Colorado, Florida, and San Diego

When Congress and the states moved the per se BAC threshold from 0.10 to 0.08, NHTSA commissioned three field studies to determine whether the SFST battery retained its discriminative power at the lower threshold. These are the studies routinely cited by prosecution witnesses to support the accuracy of the battery.

The Colorado Validation Study—Burns and Anderson, A Colorado Validation Study of the Standardized Field Sobriety Test (SFST) Battery, Final Report (NHTSA Project No. 95-408-17-05, November 1995)—reported that officers' decisions to arrest drivers were correct 93 percent of the time. The Florida Validation Study—Burns and Dioquino, A Florida Validation Study of the Standardized Field Sobriety Test (S.F.S.T.) Battery (NHTSA 1997)—reported a correct-arrest rate of 95 percent. The San Diego Validation Study—Stuster and Burns, Validation of the Standardized Field Sobriety Test Battery at BACs Below 0.10 Percent, Final Report, DOT-HS-808-839 (NHTSA, August 1998)—reported a correct-arrest rate of 91 percent at the 0.08 threshold. Each of these figures is cited at page 8 of Session 8 of the 2023 NHTSA Instructor Manual.

These topline numbers are misleading without context. A close reading of the underlying data reveals significant problems that are rarely disclosed in officer testimony. In Table 4 of the Florida study, 18 percent of individuals whose BAC was below 0.08 nevertheless exhibited five or six HGN clues, and over 50 percent of individuals below 0.08 exhibited at least four HGN clues. Section V, Subsection C, Topic 1 of the Florida study acknowledged that in 67 percent of all incorrect arrests of drivers under 0.08 the driver exhibited all six HGN clues, and that 70 percent of correctly released drivers under 0.08 nonetheless showed two or more clues on the walk-and-turn test. In addition, as documented in Table 3 of the Florida study, officers administered non-standardized tests in approximately one out of every seven Florida investigations, despite knowing that their data were being collected for a NHTSA validation study.

The Colorado study likewise documented incorrectly arrested drivers whose actual BACs were as low as 0.02 to 0.04, as illustrated at Figure 11 of the report.

The San Diego study openly conceded in its Implications section that some of its data were unreliable and that the practical use of the SFSTs in the field is rarely as consistent as the laboratory protocol assumes. It is also worth noting that NHTSA's training materials currently state that Horizontal Gaze Nystagmus is 88% accurate, the Walk-and-Turn is 79% accurate, and the One-Leg Stand is 83% accurate, without explaining why these accuracy claims differ markedly from the 1983 study. These are the magical numbers taught to officers under the current NHTSA curriculum.

The peer-reviewed literature has been even more critical. Hlastala, Polissar, and Oberman, in Statistical Evaluation of Standardized Field Sobriety Tests, 50 Journal of Forensic Sciences 1 (2005), called into question the methodology of the validation studies, and McKnight, Langston, McKnight, and Lange, in Sobriety Tests for Low Alcohol Blood Concentrations, 34 Accident Analysis & Prevention 305 (2002), demonstrated that the battery does not perform well at low alcohol concentrations.
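One reason a headline correct-arrest percentage can mislead is arithmetic: the share of arrests that prove correct depends on how many drivers in the sample were above the threshold to begin with, not only on how well the tests discriminate. The sketch below illustrates that dependence. Every number in it is hypothetical and chosen for illustration; none is drawn from the Colorado, Florida, or San Diego reports, and the function name is an assumption of this example.

```python
def correct_arrest_share(prevalence, sensitivity, specificity):
    """Share of arrest decisions involving a driver actually above the threshold.

    prevalence  -- fraction of tested drivers above the threshold
    sensitivity -- fraction of above-threshold drivers who are arrested
    specificity -- fraction of below-threshold drivers who are released
    """
    true_arrests = prevalence * sensitivity
    false_arrests = (1 - prevalence) * (1 - specificity)
    return true_arrests / (true_arrests + false_arrests)


# Hypothetical test characteristics, held fixed across all three scenarios.
SENSITIVITY = 0.90
SPECIFICITY = 0.60

# Only the mix of drivers changes from one scenario to the next.
for prevalence in (0.80, 0.50, 0.30):
    share = correct_arrest_share(prevalence, SENSITIVITY, SPECIFICITY)
    print(f"{prevalence:.0%} above threshold -> {share:.1%} of arrests correct")
```

With the test characteristics held constant, the share of correct arrests in this sketch falls from roughly 90 percent to roughly 49 percent as the proportion of above-threshold drivers drops from 80 percent to 30 percent. The test itself has not changed; only the population has.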

The Three Tests of the Standardized Battery

The SFST battery as currently administered consists of three tests, in a fixed order: horizontal gaze nystagmus, walk-and-turn, and one-leg stand. Each test has a precise administrative protocol and a list of standardized clues, set out in the 2023 NHTSA DWI Detection and Standardized Field Sobriety Testing Participant Manual.

Walk-and-Turn

The walk-and-turn is a divided-attention test designed to require the subject simultaneously to maintain a heel-to-toe stance, listen to instructions, and then perform a sequence of nine heel-to-toe steps in each direction with a prescribed turn. The instructional phase, as set out at pages 42 and 43 of Session 8 of the 2023 NHTSA Participant Manual, requires the officer to demonstrate the heel-to-toe stance, place the subject's left foot on a real or imaginary line, place the right foot on the line ahead of the left with the heel of the right foot against the toe of the left foot, place the arms at the sides, and instruct the subject to maintain that position until the instructions are complete.

The walking phase, set out at page 43 of the same Session 8, requires the officer to instruct the subject to take nine heel-to-toe steps on the line, turn by keeping the front (lead) foot on the line and taking a series of small steps with the other foot, and return nine heel-to-toe steps down the line, all while keeping the arms at the sides, watching the feet, and counting the steps out loud.

Officers are trained to look for, and only for, eight standardized clues identified in the same Session 8: cannot keep balance while listening to the instructions; starts too soon; stops while walking; does not touch heel-to-toe; steps off the line; uses arms to balance; improper turn; and incorrect number of steps. Two or more clues, in NHTSA's training, indicate a BAC at or above the threshold. Failing to count out loud is not a clue, and miscounting is not a clue. Starting too soon means starting before the officer says to begin, not pausing to compose oneself in the heel-to-toe stance.
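The closed-list character of the walk-and-turn scoring can be sketched in a few lines of code. This is an illustration only, not NHTSA's scoring form: the label strings, the function name, and the decision to count a repeated observation once are all assumptions of the sketch. The point it shows is that an observation outside the eight listed clues contributes nothing to the score.

```python
# Closed list of walk-and-turn clues as enumerated in the text above.
# The label strings are illustrative, not NHTSA's exact wording.
WAT_CLUES = frozenset({
    "cannot keep balance during instructions",
    "starts too soon",
    "stops while walking",
    "does not touch heel-to-toe",
    "steps off the line",
    "uses arms to balance",
    "improper turn",
    "incorrect number of steps",
})
WAT_THRESHOLD = 2  # two or more clues, per the training described above


def score_walk_and_turn(observations):
    """Separate an officer's noted observations into clues and non-clues.

    `observations` is a hypothetical list of labels. Each distinct label
    is counted once (an assumption of this sketch).
    """
    noted = set(observations)
    counted = sorted(noted & WAT_CLUES)
    not_clues = sorted(noted - WAT_CLUES)
    return {
        "clues": counted,
        "non_standardized": not_clues,
        "meets_threshold": len(counted) >= WAT_THRESHOLD,
    }


# Example: three things written down, only one of which is a listed clue.
report = score_walk_and_turn([
    "uses arms to balance",
    "miscounted steps aloud",  # not on the closed list
    "spoke softly",            # not on the closed list
])
print(report)
```

In this example the officer's notes contain three observations, but only one is a standardized clue, so the two-clue threshold is not met.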

One-Leg Stand

The one-leg stand is a second divided-attention test. As described at pages 50 and 51 of Session 8 of the 2023 NHTSA Participant Manual, the subject is instructed to stand with the heels together and arms at the sides, raise one leg approximately six inches off the ground with the foot pointed out, and count from 1001 to 1030 while looking at the elevated foot. The officer is to observe the subject from a safe distance for thirty seconds. Officers are trained to look for, and only for, four standardized clues: sways while balancing; uses arms for balance; hops; and puts the foot down. Two or more clues indicate, in NHTSA's training, a BAC at or above the threshold. The height to which the subject elevates the foot is not a clue; NHTSA Session 8, page 51, instructs the officer to observe the subject from a safe distance and to give an instruction to pick the foot up only if the subject puts the foot down. An officer who interrupts the test to instruct the subject to raise the foot higher has injected a non-standardized correction that itself can compromise the validity of the test.

Horizontal Gaze Nystagmus

The HGN test, although the first test administered in the standardized battery as defined at page 13 of Session 1 of the 2023 NHTSA Participant Manual, raises distinct vision-science questions that are addressed in detail on the HGN Expert Witness Services subpage. In summary, the test asks the officer to observe each eye for three clues—lack of smooth pursuit, distinct and sustained nystagmus at maximum deviation, and onset of nystagmus prior to 45 degrees—for a maximum of six clues across both eyes. Four or more clues are taken as indicative of a BAC at or above the threshold.
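The clue arithmetic just summarized (three clues per eye, six in total, four or more treated as indicative) can be expressed as a short tally. This is a minimal sketch under stated assumptions: the function name and the use of sets of text labels are hypothetical, and the code models only the counting rule, not the administration of the test.

```python
# The three HGN clues observed in each eye, as listed in the text above.
HGN_CLUES_PER_EYE = (
    "lack of smooth pursuit",
    "distinct and sustained nystagmus at maximum deviation",
    "onset of nystagmus prior to 45 degrees",
)
HGN_THRESHOLD = 4  # four or more of the six possible clues


def tally_hgn(left_eye, right_eye):
    """Return (total clues, threshold met) for the clues noted in each eye.

    Each argument is a set of clue labels; anything outside the
    three listed clues is rejected rather than counted.
    """
    allowed = set(HGN_CLUES_PER_EYE)
    for eye in (left_eye, right_eye):
        unknown = set(eye) - allowed
        if unknown:
            raise ValueError(f"not a standardized HGN clue: {sorted(unknown)}")
    total = len(set(left_eye)) + len(set(right_eye))
    return total, total >= HGN_THRESHOLD


# Example: two clues in the left eye and one in the right.
total, flagged = tally_hgn(
    left_eye={"lack of smooth pursuit", "onset of nystagmus prior to 45 degrees"},
    right_eye={"lack of smooth pursuit"},
)
print(total, flagged)
```

Here three clues are recorded across both eyes, which falls below the four-clue threshold.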

The Michigan Statutory Framework: MCL 257.62a and MCL 257.625s

Michigan has enacted two statutes that directly govern the admissibility of SFST evidence. The first, MCL 257.62a, defines a standardized field sobriety test as one of the standardized tests validated by NHTSA, and provides that a field sobriety test qualifies as a standardized field sobriety test under the section if it is administered in substantial compliance with the standards prescribed by NHTSA. The phrase "substantial compliance" is the operative limit: a test administered without substantial compliance with NHTSA standards is not a standardized field sobriety test within the meaning of the statute, and the prosecution cannot rely on the validation studies to establish accuracy.

The second statute, MCL 257.625s, governs the admissibility of testimony regarding standardized field sobriety tests. It provides that a person who is qualified by knowledge, skill, experience, training, or education in the administration of standardized field sobriety tests, including the horizontal gaze nystagmus test, shall be allowed to testify subject to a showing of a proper foundation of qualifications. The statute also expressly preserves the admissibility of non-standardized field sobriety tests if the test complies with the Michigan Rules of Evidence—principally MRE 702 and MRE 703.

The interaction of these two statutes is critical. MCL 257.62a defines the universe of "standardized" tests by reference to NHTSA validation. MCL 257.625s permits expert testimony about both standardized and non-standardized tests, but requires the proponent to lay a foundation under the Michigan Rules of Evidence. A test that is purportedly "standardized" but in fact administered without substantial compliance is, in legal effect, a non-standardized test masquerading as a validated procedure—and the prosecution must lay an MRE 702 foundation that it cannot lay.

NHTSA's Strict Standardization Requirement

NHTSA's own training materials are unambiguous on the subject of deviation from the prescribed protocol. As already noted, page 16 of Session 6 of the 2023 NHTSA Participant Manual states that the validation of the battery applies only when the tests are administered in the prescribed and standardized manner, the standardized clues are used to assess the subject's performance, and the standardized criteria are employed to interpret that performance, and that if any one of the SFST elements is changed, the validity may be compromised.

The 2002 NHTSA Instructor Manual, HS 178 R1/02, was even more direct. At page 8, it stated in capitalized text that the standardized field sobriety tests are not at all flexible, and that they must be administered each time exactly as outlined in the course. NHTSA has softened this language in subsequent editions, but the underlying point has never been retracted: the validation studies measure the accuracy of the protocol as administered in the studies, not the accuracy of any officer's improvised approximation of the protocol.

This principle has direct evidentiary consequences. If the officer omits or modifies an instructional element, scores a non-standardized clue as if it were standardized, fails to administer the test in the prescribed sequence, or interrupts the test to provide additional coaching, the prosecution cannot fairly invoke the 91 to 95 percent accuracy figures from the Colorado, Florida, and San Diego studies. The only honest position is that the officer administered an unvalidated variant of the battery whose accuracy is unknown.

Common Administrative Failures Identified in Michigan Cases

Across the OWI cases I have reviewed as an expert witness, certain categories of administrative error recur with predictable regularity. Identifying these errors, and explaining their consequences for the validity of the test, is the core analytical work of an SFST expert.

Improper instructional phase. Officers frequently fail to demonstrate the heel-to-toe stance, fail to instruct the subject to maintain that stance throughout the instructions, fail to confirm comprehension before commencing the test, or commence the test before the subject has acknowledged understanding. Each of these omissions removes one of the divided-attention components that the test was designed to measure.

Non-standardized environmental conditions. The walk-and-turn requires a reasonably dry, hard, level, non-slippery surface and sufficient room for the subject to complete nine heel-to-toe steps in each direction. Tests administered on a sloped shoulder, on gravel, in heavy rain, in snow, or with traffic passing within close proximity raise serious validity concerns that the validation studies did not address.

Improper scoring of non-standardized clues. I frequently encounter officers who score, as if they were standardized clues, behaviors that are not on the closed NHTSA list—for example, failing to count the steps out loud, asking a question during the test, speaking softly, or wobbling during the instructional phase. Counting is not a clue. Talking is not a clue. Failure to maintain the starting position and starting the test before the officer has finished the instructions are the only clues that NHTSA recognizes during the instructional phase. During the walking phase, the only clues are stepping off line, improper number of steps, stopping during the test, improper turn, raising arms for balance, and failing to walk heel-to-toe.

Improper turn instruction. The walk-and-turn requires the officer to instruct the subject to keep the front (lead) foot on the line and turn by taking a series of small steps with the other foot. Officers who simply tell the subject to "turn around" or to "pivot" have administered a different test than the one validated.

Improper one-leg stand instructions. Officers frequently fail to demonstrate the test, fail to instruct the subject to count properly, or interrupt the test to tell the subject to elevate the foot higher than approximately six inches. The height of foot elevation is not a clue, and officer interruptions to correct foot height are non-standardized interventions.

Inappropriate test subjects. NHTSA training expressly states that the walk-and-turn and one-leg stand are not validated for individuals over 65 years of age, individuals more than 50 pounds overweight, individuals with back, leg, or middle-ear conditions, or individuals wearing heels more than two inches tall. Tests administered to subjects in these categories yield results whose meaning is undefined.

Application to drug-impaired drivers. The SFST battery was developed and validated to discriminate alcohol-impaired drivers from sober drivers. It has never been validated for marijuana, prescription medications, or polysubstance use. The published decision in People v Bowden, 344 Mich App 171 (2022), and NHTSA's 2017 report, Marijuana-Impaired Driving: A Report to Congress, both acknowledge that no scientifically validated method exists to detect marijuana-impaired driving with the reliability of alcohol testing.
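The subject categories listed under "Inappropriate test subjects" above lend themselves to a simple screening check of the kind I apply when reviewing a file. The sketch below is illustrative only: the class, field, and function names are hypothetical, and the cutoffs are simply the ones recited in that paragraph.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    """Hypothetical record of facts bearing on balance-test suitability."""
    age_years: int
    pounds_overweight: float
    has_back_leg_or_middle_ear_condition: bool
    heel_height_inches: float


def validation_exclusions(subject):
    """List the reasons, if any, the balance tests fall outside the validated population."""
    reasons = []
    if subject.age_years > 65:
        reasons.append("over 65 years of age")
    if subject.pounds_overweight > 50:
        reasons.append("more than 50 pounds overweight")
    if subject.has_back_leg_or_middle_ear_condition:
        reasons.append("back, leg, or middle-ear condition")
    if subject.heel_height_inches > 2:
        reasons.append("heels more than two inches tall")
    return reasons


# Example: a 70-year-old subject in flat shoes with a prior knee injury.
example = Subject(
    age_years=70,
    pounds_overweight=10,
    has_back_leg_or_middle_ear_condition=True,
    heel_height_inches=0.5,
)
print(validation_exclusions(example))
```

An empty list means none of the listed categories applies. It does not mean the test was properly administered, which is a separate question.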

Non-Standardized Roadside Tests

Officers in Michigan continue to administer roadside tasks that have no place in the validated battery: the alphabet recitation, counting backward, finger counting, "pick a number," and various improvised exercises. NHTSA's own materials address this practice expressly. At page 10 of Session 6 of the 2023 NHTSA Participant Manual, the alphabet, finger count, and counting backward are listed as pre-exit interview techniques to help officers conduct quick checks at sobriety checkpoints, and the materials state in capitalized text that these techniques do not replace the SFSTs. The alphabet, finger-count, and finger-to-nose, for example, were all rejected by SCRI in the original Burns and Moskowitz 1977 research as having insufficient evidential value to be included in the test battery.

Under MCL 257.625s, evidence of a non-standardized field sobriety test is admissible only if it complies with the Michigan Rules of Evidence. In practice, this means the prosecution must satisfy MRE 702 and the principles of Daubert v Merrell Dow Pharmaceuticals, Inc, 509 US 579 (1993), as adopted by the Michigan Supreme Court in Gilbert v DaimlerChrysler Corp, 470 Mich 749 (2004). The proponent must establish that the test is the product of reliable principles and methods and that the officer applied those principles reliably to the facts of the case. Most non-standardized roadside tests cannot survive that inquiry.

My Training and Experience with the SFST Battery

I completed the NHTSA/IACP Standardized Field Sobriety Testing Practitioner Certification Course in April 2005, and I have served as a trainer at NHTSA/IACP Live Alcohol Workshops on multiple occasions, including in 2007, 2009, 2012, and 2017. The Live Alcohol Workshop is the practical component of the certification course in which trainee officers administer the battery to dosed volunteer subjects whose actual BACs are known. Serving as a trainer in this setting has given me direct, repeated experience administering and observing the SFST battery on hundreds of individuals across the full range of BACs, from sober to well above the per se threshold.

I also completed the NHTSA/IACP Advanced Roadside Impaired Driving Enforcement (ARIDE) training in 2017 and advanced training in horizontal gaze nystagmus from a vision-science perspective in 2008, along with continuing education at the National College for DUI Defense, the Mastering Scientific Evidence in DWI/DUI Cases program, and a number of other programs. I have studied the peer-reviewed literature on field sobriety testing and the underlying NHTSA technical reports rather than relying on summary slides. The relevant peer-reviewed work includes Hlastala, Polissar, and Oberman (2005), already cited above; Rubenzer and Stevenson, Horizontal Gaze Nystagmus: A Review of Vision Science and Application Issues, Journal of Forensic Sciences (2010); and Kane & Kane, The high reported accuracy of the standardized field sobriety test is a property of the statistic not of the test, Law, Probability and Risk, Volume 20, Issue 1, March 2021, Pages 1–13, https://doi.org/10.1093/lpr/mgab004.

From 2021 through 2024, I served as an Adjunct Professor of Forensic Science at Madonna University, where I taught FOR 4650 Ethics & Expert Testimony and FOR/CJ 5230 Criminal Law and the Rules of Evidence. The course content addressed, in substantial part, the evidentiary standards governing forensic testimony of the kind at issue in OWI prosecutions.

Courtroom Qualification and Legislative Testimony

I have been qualified by Michigan courts as an expert witness on the standardized field sobriety test battery on numerous occasions. The earliest of these qualifications occurred in 2011 in the 35th District Court for Wayne County and the Genesee County District Court, and the qualifications have continued through the present, including expert reports and testimony in a variety of courts across the state.

In 2007, I represented the defendant in People v Wyrybkowski, Court of Appeals Case No. 283673, in which the Michigan Court of Appeals held that a defense expert witness must be permitted to testify under Daubert in a challenge to the horizontal gaze nystagmus test. The Wyrybkowski decision established that the trial-court gatekeeping function under MRE 702 applies symmetrically to both prosecution and defense expert testimony on field sobriety testing.

In 2015, I provided testimony before the Michigan Senate Judiciary Committee regarding standardized field sobriety testing in the course of the legislature's consideration of amendments to the Michigan Vehicle Code. I have also lectured extensively on SFST issues, including Cross Examination on Flawed Field Sobriety Tests for the Michigan Association of OWI Attorneys (MIAOWIA) in 2024, Introduction to Field Sobriety Testing for the Macomb County Bar Association in 2015, and Introduction to Field Sobriety Testing and Statistical Flaws in the SFST Battery at the Criminal Defense Attorneys of Michigan (CDAM) Summer Session in 2012, and I authored Witness Preparation and Examination for DUI Proceedings, published by West in 2012.

Scope of Engagement

A typical SFST expert engagement begins with my review of the police narrative report, the officer's SFST data sheet (or its absence, which is itself a finding), the in-car and body-worn camera video, the officer's training records and any continuing-education certificates, the dispatch and citation records, and any contemporaneous medical or physical-condition information that would bear on the subject's ability to perform the tests. Where appropriate, I also review the preliminary breath test (PBT) records and the evidential breath test results, although those issues are addressed primarily on the Breath Testing Expert Witness Services subpage.

I then prepare a written expert report that identifies, on a test-by-test and clue-by-clue basis, every departure from the NHTSA-prescribed protocol, every non-standardized clue improperly scored, every environmental factor that compromised the validity of the test, and every recorded behavior that the officer either failed to score or improperly attributed to impairment. The report sets out the applicable Michigan statutory and evidentiary framework, identifies the peer-reviewed literature that bears on the reliability questions at issue, and explains—in language accessible to the trier of fact—why the deviations identified compromise the validation studies on which the prosecution will attempt to rely.

Retention

I accept SFST expert witness engagements from defense attorneys throughout Michigan and, on a case-by-case basis, in neighboring jurisdictions. Inquiries may be directed to Maze Legal PLC, 37211 Goddard Road, Romulus, Michigan 48174, or by telephone at (734) 941-8800.
