Flaky test

A flaky test (or flaky test case) is a software test that behaves non-deterministically: it may pass or fail across runs without any change to the code under test.^[1]^[2] Flakiness can arise from several causes, including concurrency issues, timing dependencies, reliance on external systems, and insufficient isolation between tests.^[3]^[4] Flaky tests are particularly problematic in continuous integration environments because they erode trust in automated test suites and can mask real defects, leading to wasted debugging effort and reduced productivity.^[5]
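As an illustration of one cause named above, insufficient isolation between tests, the following minimal sketch shows two checks that share mutable state, so the outcome of one depends on execution order. The cache, the check functions, and the `run_in_order` helper are hypothetical names introduced only for this example; they do not come from any particular test framework.

```python
# Hypothetical module-level state shared by two checks (an assumption for illustration).
_cache = {}


def check_writes_cache():
    # Leaves an entry behind in the shared cache.
    _cache["user"] = "alice"
    assert _cache["user"] == "alice"


def check_expects_empty_cache():
    # Passes only if no earlier check has polluted the shared cache.
    assert _cache == {}


def run_in_order(checks):
    """Run the checks in the given order, starting from a clean cache.

    Returns a dict mapping each check's name to "pass" or "fail".
    """
    _cache.clear()
    results = {}
    for check in checks:
        try:
            check()
            results[check.__name__] = "pass"
        except AssertionError:
            results[check.__name__] = "fail"
    return results


if __name__ == "__main__":
    # Same code, different order, different verdict for check_expects_empty_cache.
    print(run_in_order([check_expects_empty_cache, check_writes_cache]))
    print(run_in_order([check_writes_cache, check_expects_empty_cache]))
```

In a real suite the order may vary with parallel execution or test discovery, which is what makes such a failure appear intermittent.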

Flaky tests can be mitigated by improving test isolation, controlling sources of nondeterminism (e.g., time and randomness), mocking external dependencies, and rerunning tests to confirm failures.^[6]^[7] Flakiness is a well-known problem in large-scale software development, where it is commonly handled through test retries, quarantine mechanisms, and better test design practices.^[8] Nevertheless, flaky tests remain an active research area in software engineering, especially for distributed systems and large code bases, where nondeterminism is more prevalent.^[9]
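Controlling time and randomness can be sketched as follows, assuming the code under test accepts its clock value and random number generator as parameters instead of reading the system clock or the global generator. The function `make_token` and its token format are hypothetical and serve only to illustrate the idea.

```python
import random
from datetime import datetime, timezone


def make_token(rng, now):
    """Hypothetical function under test: a date prefix plus a random suffix.

    Both sources of nondeterminism (the current time and the random
    generator) are passed in, so a test can pin them down.
    """
    suffix = rng.randint(1000, 9999)
    return f"{now:%Y%m%d}-{suffix}"


def check_make_token_is_deterministic():
    # A fixed timestamp replaces the system clock.
    fixed_now = datetime(2024, 1, 2, tzinfo=timezone.utc)
    # Two generators with the same seed produce the same sequence.
    first = make_token(random.Random(42), fixed_now)
    second = make_token(random.Random(42), fixed_now)
    assert first == second
    assert first.startswith("20240102-")


if __name__ == "__main__":
    check_make_token_is_deterministic()
    print("deterministic")
```

Production code would pass `datetime.now(timezone.utc)` and an unseeded `random.Random()`, while tests supply fixed values, so every run exercises the same inputs.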

References

[1] Luo, Qingzhou; Harman, Mark; Gao, Yuanliang (2016). "A large-scale empirical comparison of static and dynamic test case prioritization techniques". Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering. pp. 559–570. arXiv:1801.05917. doi:10.1145/2950290.2950344. ISBN 978-1-4503-4218-6.
[2] Bell, Jonathan; Legunsen, Owolabi; Hilton, Michael; Eloussi, Lamis; Yung, Tegawendé F. Bissyandé (2018). "DeFlaker: Automatically detecting flaky tests". Proceedings of the 40th International Conference on Software Engineering. pp. 433–444. doi:10.1145/3180155.3180164. ISBN 978-1-4503-5638-1.
[3] Lam, Wing; Kang, Shin Yoo (2020). "Root causes of flaky tests in software systems". IEEE Transactions on Software Engineering. doi:10.1109/TSE.2019.2905169.
[4] Gao, Yuanliang; Zhang, Lingming (2019). "Understanding flaky tests: The developer's perspective". doi:10.1109/ICSE.2019.00077.
[5] Micco, John (2017). "Flaky tests at Google and how we mitigate them". IEEE Software. 34 (3): 56–63. doi:10.1109/MS.2017.34.
[6] Eck, William; Palomba, Fabio; De Lucia, Andrea (2021). "Understanding and addressing flaky tests: A systematic literature review". Journal of Systems and Software. arXiv:2010.03303. doi:10.1016/j.jss.2021.110911.
[7] Gruber, Markus; Fraser, Gordon (2021). "GUIDER: GUI structure and vision co-guided test script repair for Android apps". Proceedings of the 30th ACM SIGSOFT International Symposium on Software Testing and Analysis. pp. 191–203. doi:10.1145/3460319.3464830. ISBN 978-1-4503-8459-9.
[8] Habchi, Sarrah; Papadakis, Mike (2022). "On the use of retries for flaky tests". Empirical Software Engineering. doi:10.1007/s10664-021-10048-3.
[9] Zhang, Lingming; Elbaum, Sebastian (2020). "Flaky tests in modern software development: Characteristics and mitigation". ACM Computing Surveys. doi:10.1145/3381035.
