The Overrated 4-Ball Test: Why It Fails in Practical Lubricant Assessment

by Rafe Britton | Articles, Lubricants


In the intricate world of lubrication and tribology, the 4-ball test has become a core experimental procedure and is commonly listed as a performance parameter on lubricant and grease data sheets.

These tests are designed to measure friction and a lubricant’s wear resistance, and they are indispensable tools for rapidly screening lubricant and grease formulations.

However, end users often cite the results of 4-ball testing as primary criteria in selecting gear and circulating oils. Should this be the case? Let’s examine what the 4-ball test is and its limitations.

Understanding 4-Ball Wear and Weld Tests

At its heart, the 4-ball test is elegantly simple. The test involves three stationary steel balls arranged in a triangular formation in a cup, with a fourth ball, held by a chuck, rotating against them.

This assembly is often bathed in the lubricant under examination, ensuring a thorough assessment of its properties under simulated conditions of use. By adjusting the speed of the rotating ball and the load applied to it, tribologists and lubricant formulators can simulate several conditions.
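To make the load and speed settings concrete, the following minimal sketch converts a machine setting into the conditions at each of the three contacts. It assumes four equal balls whose centres form a regular tetrahedron. The default values (12.7 mm balls, 392 N, 1200 rpm) are illustrative assumptions typical of four-ball wear testing, not figures taken from this article, and the function and parameter names are hypothetical.

```python
import math

def contact_conditions(load_n=392.0, speed_rpm=1200.0, ball_diameter_m=0.0127):
    """Per-contact normal force (N) and sliding speed (m/s) in a four-ball rig.

    Assumes four equal balls whose centres form a regular tetrahedron.
    Default inputs are illustrative assumptions, not values from the article.
    """
    radius = ball_diameter_m / 2.0
    # Angle between the vertical axis and the line joining the top ball's
    # centre to a lower ball's centre: cos(theta) = sqrt(2/3).
    cos_theta = math.sqrt(2.0 / 3.0)
    sin_theta = math.sqrt(1.0 / 3.0)
    # The vertical load is shared by three contacts, each inclined by theta.
    normal_force = load_n / (3.0 * cos_theta)
    # Each contact point sits on a circle of radius R*sin(theta) about the axis.
    track_radius = radius * sin_theta
    sliding_speed = 2.0 * math.pi * (speed_rpm / 60.0) * track_radius
    return normal_force, sliding_speed

force, speed = contact_conditions()
print(f"Normal force per contact: {force:.0f} N")
print(f"Sliding speed at contact: {speed:.2f} m/s")
```

With these assumed inputs, each contact carries roughly 160 N and slides at a little under 0.5 m/s, so a change in machine load or speed maps directly onto the severity of all three contacts at once.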

The basic setup for a 4-ball rig. Note that the red circles denote point contacts between the balls.

A distinctive feature of the 4-ball test is its ability to evaluate two critical aspects: the weld load and the wear scar. The weld load test (ASTM D2783 for lubricants and ASTM D2596 for greases) is primarily concerned with determining the extreme pressure properties of the lubricant.

This procedure progressively increases the load until welding between the balls is detected. In applications involving high pressures and loads, the base oil viscosity is often insufficient to prevent metal-to-metal contact. In these boundary lubrication regimes, extreme pressure additives in the form of sulphurised olefins, fatty acids, and solid lubricants are often required to protect machine surfaces.

In contrast, the wear scar test (ASTM D2266 for greases and ASTM D4172 for lubricants) focuses on the lubricant’s wear-preventive characteristics.

Here, a constant load is applied, and the test measures the diameter of the wear scars on the stationary balls after a predetermined period. The wear scar’s size indicates the lubricant’s ability to protect against wear, a vital attribute in prolonging machinery life and ensuring smooth operation.
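As a rough illustration of how a wear scar result is reduced to a single number, the sketch below averages scar measurements across the three stationary balls. It assumes two diameter readings per ball taken at right angles, and the readings are made-up placeholder values, not data from any real test. The function name is hypothetical, and the relevant standard should be consulted for the exact reporting procedure.

```python
from statistics import mean

def mean_wear_scar_mm(readings):
    """Average wear scar diameter (mm) over the three stationary balls.

    `readings` is a list of (d1, d2) pairs, one pair per stationary ball,
    where d1 and d2 are scar diameters measured at right angles.
    """
    if len(readings) != 3:
        raise ValueError("expected readings for three stationary balls")
    return mean(d for pair in readings for d in pair)

# Placeholder readings for illustration only (mm); not real test data.
example = [(0.42, 0.44), (0.45, 0.43), (0.41, 0.43)]
print(f"Mean wear scar diameter: {mean_wear_scar_mm(example):.2f} mm")
```

A smaller mean diameter indicates less material removed from the balls, which is why data sheets quote this single figure as a shorthand for wear protection.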

From the Laboratory Bench to Reality in the Field

As with all bench tests, the 4-ball test attempts to create a reliably repeatable condition that can be performed relatively inexpensively and in much less time than would be required for field trials. In this respect, it has succeeded immensely, and 4-ball test rigs are a common feature of tribology labs worldwide.

However, the conditions inside the test rig bear little resemblance to those seen in machinery. There are two components to this – the size of the interaction between the balls and the interaction type.

Concerning the size of the interaction, two spheres touch at a single point. The test load is therefore carried by a very small area (in practice, a tiny patch formed as the steel deforms elastically), which drives the contact pressure far higher than the same load would produce over a larger contact.
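The pressure produced by loading such a small contact can be estimated with Hertzian contact theory for two elastic spheres. The sketch below is a minimal calculation under stated assumptions: 12.7 mm steel balls, a Young's modulus of 210 GPa, a Poisson's ratio of 0.3, and a 392 N vertical load shared by three contacts in a tetrahedral arrangement. None of these values come from this article, and the function name is hypothetical.

```python
import math

def hertz_sphere_on_sphere(normal_force_n, ball_diameter_m=0.0127,
                           youngs_modulus_pa=210e9, poisson_ratio=0.3):
    """Hertzian contact radius (m) and peak pressure (Pa) for two equal spheres.

    Material and geometry defaults are assumptions for typical steel balls.
    """
    radius = ball_diameter_m / 2.0
    # Effective radius of two equal spheres: 1/R' = 1/R + 1/R.
    effective_radius = radius / 2.0
    # Effective modulus for two identical materials: 1/E' = 2*(1 - nu^2)/E.
    effective_modulus = youngs_modulus_pa / (2.0 * (1.0 - poisson_ratio ** 2))
    contact_radius = (3.0 * normal_force_n * effective_radius
                      / (4.0 * effective_modulus)) ** (1.0 / 3.0)
    peak_pressure = 3.0 * normal_force_n / (2.0 * math.pi * contact_radius ** 2)
    return contact_radius, peak_pressure

# Assumed 392 N vertical load, shared by three contacts inclined at
# cos(theta) = sqrt(2/3) to the vertical (tetrahedral ball arrangement).
normal_force = 392.0 / (3.0 * math.sqrt(2.0 / 3.0))
a, p_max = hertz_sphere_on_sphere(normal_force)
print(f"Contact radius: {a * 1e3:.2f} mm")
print(f"Peak contact pressure: {p_max / 1e9:.1f} GPa")
```

Under these assumptions the initial contact patch is a fraction of a millimetre across and the peak pressure is in the gigapascal range. This is an elastic estimate for the start of a test only; once a wear scar forms, the contact area grows and the nominal pressure falls.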

Additionally, the contact is a sliding interaction. This combination is rarely observed in machinery, where the most severe combinations are line contact with sliding (as in journal bearings) or point contact with rolling (ball bearings).

Interaction between spur gears and cylindrical roller bearings – typically a “line” contact

So, it is established that the test rig does not accurately simulate real-world contacts, but is there any downside to using the test results to indicate lubricant performance?

A Disconnect Between Test Results and Performance

In a 2008 research paper, members of the FZG Institute in Germany evaluated test methods for gear lubricants. Among other curiosities, the paper assessed many of the bench test methods available and their relevance to real-world performance.

Figure from “Test Methods for Gear Lubricants” – Hoehn, B-R, Oster, P, Tobie, T, Michaelis, K (2008)

A surprising result from the paper was the relative performance of common household liquids such as milk and beer, which scored higher on 4-ball weld load than a non-EP mineral ISO 220 gear oil and an antiwear ISO 46 hydraulic oil.

Yet on the FZG scuffing test, the relative performance of these four liquids was as we would expect, with the AW hydraulic oil achieving the highest rating, followed by the mineral gear oil, milk, and beer, respectively.

As the FZG test more closely simulates actual conditions inside industrial gearing (the interaction between two specified gear profiles), its results should be more reflective of real-world performance.

As development tests, the 4-ball weld and wear scar tests carry considerable weight. They are inexpensive and repeatable and can give tribologists and formulators an idea of whether their formulation is directionally correct. But as a tool for lubricant selection, we should be skeptical of them and look to other test procedures to indicate the likely performance.


  • Rafe Britton

    Hi, I’m Rafe Britton, the Lubrication Expert. I’m known within the industry for my YouTube channel and podcast, and I work with mid-size industrials to improve their equipment uptime while reducing the cost of their lubrication program. I’m a mechanical engineer with 13 years of experience on both sides of the industry, both as an operator and lubricant supplier. I hold a Bachelor of Aerospace Engineering and a Bachelor of Physics from UNSW. I’ve helped dozens of industrial clients upskill their workforce, reduce waste, improve reliability and take great strides toward their corporate sustainability goals. I serve on the Australian Lubricant Association technical committee to push the lubricants industry forward and help end-users better understand the importance of lubrication.