"Program testing can be used very effectively to show the presence of bugs but never to show their absence." ~ Edsger W. Dijkstra
1 - Overview
Continuous Integration relies heavily on automated testing.
Writing tests (i.e. self-testing code) is a requirement for software developers. It is important to know that these tests can be categorized into different types (e.g. Unit Test, Acceptance Test, etc). Unfortunately, the internet contains varying definitions for each of these test types and it can be quite confusing to developers.
In this article you will learn about the different types of tests and how the Fabric-Team differentiates between them.
This article mainly focuses on testing back-end applications, but the same ideas can also be applied to front-end apps.
2 - Test Types
"Broad-stack tests have the advantage of exercising the application with all its parts connected together and thus can find bugs in the interaction between components in the way that component tests cannot. However broad-stack tests also tend to be harder to maintain and slower to run than component tests."
| Test Type | Description | Other |
|---|---|---|
| Unit Test / Component Test | | *Test.java |
| Integration Test | | *IT.java |
| End-to-End Test / E2E Test / Full-Stack Test / Broad-Stack Test | | *IT.java (same as Integration Test) |
| Cross-System Test | | *XT.java |
| Cross-System E2E Test | | ? |
| Acceptance Test / User Acceptance Test | | *AT.java |
| Performance Test | | *PT.java |
3 - Commonality Between All Test Types
Each Test Should be Predictable not Flaky
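When the same test is run multiple times, it should either always pass or always fail. A test that sometimes passes and sometimes fails without any code change is a Flaky Test. Mock anything that is unpredictable (e.g. the current time, random values, network calls).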
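One common source of flakiness is the system clock. The following is a minimal sketch, assuming a hypothetical `TrialPolicy` class (it is not part of this article's code base), of how injecting a `java.time.Clock` lets a test freeze time so the result is the same on every run:

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

// Hypothetical class under test: decides whether a trial period has expired.
class TrialPolicy {
    private final Clock clock;

    TrialPolicy(Clock clock) {
        this.clock = clock; // the clock is injected instead of reading the system time directly
    }

    boolean isExpired(Instant trialStart, long trialDays) {
        Instant expiry = trialStart.plus(trialDays, ChronoUnit.DAYS);
        return Instant.now(clock).isAfter(expiry);
    }
}

class TrialPolicyExample {
    public static void main(String[] args) {
        // given - a clock frozen at a known instant
        Instant now = Instant.parse("2020-01-31T00:00:00Z");
        TrialPolicy policy = new TrialPolicy(Clock.fixed(now, ZoneOffset.UTC));

        // when / then - the outcome no longer depends on when the test runs
        System.out.println(policy.isExpired(Instant.parse("2020-01-01T00:00:00Z"), 14)); // true
        System.out.println(policy.isExpired(Instant.parse("2020-01-25T00:00:00Z"), 14)); // false
    }
}
```

In production code the same class would receive `Clock.systemUTC()`, so only the test controls time.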
Each Test Should Mock External Systems (Generally), except Cross-System Test
When dealing with a datastore, queries are typically mocked. In other cases, an embedded datastore is used in place of the environment’s datastore, which also protects the environment’s data from corruption. I prefer using embedded databases for all tests, and then having a Cross-System Test to assert the connection to the actual database (maybe even test basic CRUD operations). If a test uses an actual database, make sure the database state is the same before and after running the test (e.g. if the test adds a row, delete it at the end of the test).
When dealing with an external application, do the same as with a datastore. For example, API calls should be mocked.
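As a rough illustration of mocking an API call, the sketch below hides the external system behind an interface and substitutes a hand-written stub in the test. `ExchangeRateClient` and `PriceConverter` are hypothetical names, and the fixed rate of 2 is an arbitrary test value; a mocking library could play the same role as the lambda.

```java
import java.math.BigDecimal;

// Hypothetical boundary to an external system (e.g. a remote HTTP API).
interface ExchangeRateClient {
    BigDecimal rate(String from, String to);
}

// Hypothetical class under test; it only depends on the interface, not on the real API.
class PriceConverter {
    private final ExchangeRateClient client;

    PriceConverter(ExchangeRateClient client) {
        this.client = client;
    }

    BigDecimal convert(BigDecimal amount, String from, String to) {
        return amount.multiply(client.rate(from, to));
    }
}

class PriceConverterExample {
    public static void main(String[] args) {
        // given - a stub that stands in for the external API and always returns the same rate
        ExchangeRateClient stub = (from, to) -> new BigDecimal("2");
        PriceConverter converter = new PriceConverter(stub);

        // when
        BigDecimal result = converter.convert(new BigDecimal("10"), "USD", "EUR");

        // then
        System.out.println(result); // 20
    }
}
```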
Long Running Tests Should Not be Part of the Build Process (Preferably)
We like builds to run as fast as possible.
If a long running test must be part of a build, then find ways to shorten it, for example by mocking long running processes or by cutting down redundancy (e.g. instead of persisting 100 objects to a database, persist just 1 or 2).
Each Test Should have a Perfect Balance Between DAMP & DRY
Readability matters. Duplication in tests doesn’t hurt if it improves readability.
- DAMP (Descriptive and Meaningful Phrases) increases maintainability by reducing time necessary to read and understand code
- DRY (Don’t Repeat Yourself) increases maintainability by isolating change (risk) to only parts of the system that must change
I lean towards DRY, as reused code means less code to read.
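One way to get some of both is a small, descriptively named test helper. The sketch below is only an illustration; `User`, `UserFixtures` and `GreetingService` are hypothetical. The helper keeps irrelevant defaults in one place (DRY), while its name and argument still state what matters for the test (DAMP).

```java
// Hypothetical domain type, used only for illustration.
class User {
    final String name;
    final String email;
    final boolean active;

    User(String name, String email, boolean active) {
        this.name = name;
        this.email = email;
        this.active = active;
    }
}

// DRY: defaults that are irrelevant to most tests live in one helper.
class UserFixtures {
    static User activeUserNamed(String name) {
        return new User(name, name + "@example.com", true);
    }
}

// Hypothetical class under test.
class GreetingService {
    String greet(User user) {
        return user.active ? "Welcome back, " + user.name + "!" : "Please reactivate your account.";
    }
}

class DampDryExample {
    public static void main(String[] args) {
        // given - DAMP: the helper name and argument say what matters for this test
        User alice = UserFixtures.activeUserNamed("alice");
        // when
        String greeting = new GreetingService().greet(alice);
        // then
        System.out.println(greeting); // Welcome back, alice!
    }
}
```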
Each Test Should test the Function/Behavior Not the Implementation (when possible)
A test should concern itself with the result, not the steps to the result. For example, calling the method square(x) with argument 2 should return 4; the test shouldn’t care how the value is computed (e.g. via bit manipulation).
A plus side is that refactoring code doesn’t require much change to the test. This is because the implementation changed but the behavior did not.
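To make the square(x) example concrete, here is a small sketch with two interchangeable implementations (both are illustrative assumptions). The check looks only at inputs and outputs, so it stays unchanged when one implementation is swapped for the other.

```java
import java.util.function.IntUnaryOperator;

class SquareExample {
    // Implementation 1: plain multiplication.
    static int squareByMultiplication(int x) {
        return x * x;
    }

    // Implementation 2: repeated addition (slower, but same behavior for small inputs).
    static int squareByAddition(int x) {
        int n = Math.abs(x);
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += n;
        }
        return sum;
    }

    // The check only knows about inputs and expected outputs, not about the steps taken.
    static boolean behavesLikeSquare(IntUnaryOperator square) {
        return square.applyAsInt(2) == 4
            && square.applyAsInt(-3) == 9
            && square.applyAsInt(0) == 0;
    }

    public static void main(String[] args) {
        System.out.println(behavesLikeSquare(SquareExample::squareByMultiplication)); // true
        System.out.println(behavesLikeSquare(SquareExample::squareByAddition));       // true
    }
}
```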
Each Test Should have the form: Given-When-Then (Preferably)
The idea is to break down a scenario/test into 3 sections:
- given - describes the state of the world before the behavior is tested (usually state setup is done within @Before or within the actual test method)
- when - is the behavior that is being tested
- then - describes the changes expected or the state of the world after the specified behavior
Sometimes, we write this into the name of the test method. For example:
public void givenFourUsersInDB_whenCreateNewUser_thenFiveUsersInDB() { … }
- https://martinfowler.com/bliki/GivenWhenThen.html
- other similar styles:
  - Meszaros describes the pattern as Four-Phase Test. His four phases are Setup (Given), Exercise (When), Verify (Then) and Teardown
  - Bill Wake came up with the formulation as Arrange, Act, Assert
Each Test Should have the form: Arrange Act Assert (AAA) (Preferably)
The idea is to group the code within a single test into 3 parts, which roughly mirror the Given-When-Then naming format.
    public void givenFourUsersInDB_whenCreateNewUser_thenFiveUsersInDB() {
        // 1 - Arrange
        db.insert(user1);
        db.insert(user2);
        db.insert(user3);
        db.insert(user4);

        // 2 - Act
        testObject.createNewUser(...);

        // 3 - Assert
        assertEquals(5, db.findAllUsers().count());

        // 4 - Reset
        db.deleteAllUsers();
    }
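The snippet above relies on objects defined elsewhere. As a self-contained sketch of the same shape, the version below uses a hypothetical in-memory `InMemoryUserDb` and `UserService` (assumptions made for illustration) and a plain `AssertionError` instead of a test framework. It also moves the reset step into a `finally` block, one possible way to make sure the teardown runs even when the assertion fails.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory stand-in for the db object.
class InMemoryUserDb {
    private final List<String> users = new ArrayList<>();

    void insert(String user) { users.add(user); }
    int countUsers() { return users.size(); }
    void deleteAllUsers() { users.clear(); }
}

// Hypothetical class under test.
class UserService {
    private final InMemoryUserDb db;

    UserService(InMemoryUserDb db) { this.db = db; }

    void createNewUser(String name) { db.insert(name); }
}

class UserServiceExample {
    // Shared between tests, so each test must leave it clean.
    static final InMemoryUserDb db = new InMemoryUserDb();

    static void givenFourUsersInDB_whenCreateNewUser_thenFiveUsersInDB() {
        try {
            // 1 - Arrange (given)
            for (String name : List.of("user1", "user2", "user3", "user4")) {
                db.insert(name);
            }
            UserService testObject = new UserService(db);

            // 2 - Act (when)
            testObject.createNewUser("user5");

            // 3 - Assert (then)
            if (db.countUsers() != 5) {
                throw new AssertionError("expected 5 users, found " + db.countUsers());
            }
        } finally {
            // 4 - Reset: runs even when the assertion above fails
            db.deleteAllUsers();
        }
    }

    public static void main(String[] args) {
        givenFourUsersInDB_whenCreateNewUser_thenFiveUsersInDB();
        System.out.println(db.countUsers()); // 0
    }
}
```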
4 - Other
5 - Not Curated
- unit testing - testing one unit of code (sometimes a unit is translated to a java class)
  - should constrain the behavior of the unit under test. An unfortunate side effect is that sometimes, tests also constrain the implementation
- component testing - testing multiple units of code
- integration testing - testing between 2 units of code at their integration point
  - verifies the communication paths and interactions between units/components to detect interface defects
- end-to-end testing - testing across several units within a single application
  - the value of end-to-end testing: https://www.symphonious.net/2015/04/30/making-end-to-end-tests-work/
- test pyramid
  - https://martinfowler.com/articles/practical-test-pyramid.html
  - https://martinfowler.com/bliki/TestPyramid.html
  - “the pyramid is based on the assumption that broad-stack tests are expensive, slow, and brittle compared to more focused tests, such as unit tests. While this is usually true, there are exceptions. If my high level tests are fast, reliable, and cheap to modify - then lower-level tests aren’t needed”
- assertion free testing - testing without assertions (usually done just to pass code coverage)
- integration testing - similar to end-to-end testing defined above
  - narrow integration tests - runs external dependencies locally
  - broad integration tests - calls out to real external dependencies
- code coverage - commonly mistaken as a quality target metric. Code coverage only finds untested code and that is all
- It is important to constantly question the value a unit test provides versus the cost it has in maintenance or the amount it constrains your implementation. By doing this, it is possible to keep the test suite small, focussed and high value
- Business Facing Test
  - usually these tests are defined via the Given-When-Then style (cucumber is a nice framework)
  - https://martinfowler.com/bliki/BusinessFacingTest.html
- Specification by Example (SBE) is a collaborative approach to defining requirements and business-oriented functional tests for software products based on capturing and illustrating requirements using realistic examples instead of abstract statements
- In their original (and common) world view, each time you implement a new UserStory you add one or more tests. This leads you to a simple tracing structure where each story is verified by one or more acceptance tests. But the problem with this approach is that over time the tests grow in complexity with much duplication. In their new world view there is a suite of acceptance tests that describe the application behavior in SpecificationByExample style. Each time they play a new story, they decide how to update this suite to reflect the new behavior. This breaks the simple story-to-test relationship, but results in a much simpler and coherent suite of tests ~ excerpt from https://martinfowler.com/bliki/NashvilleProject.html
- Test Impact Analysis (TIA) - https://martinfowler.com/articles/rise-test-impact-analysis.html#CreationOfSuitesAndTags
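The notes on assertion free testing and code coverage can be illustrated with a short sketch. `Discount` is a hypothetical, deliberately buggy class; both "tests" execute every line of it, so a coverage tool would report the same coverage for each, but only the one with an assertion can detect the bug.

```java
// Hypothetical class under test.
class Discount {
    // Deliberately buggy: should subtract the discount, but adds it.
    static int apply(int price, int discount) {
        return price + discount;
    }
}

class CoverageExample {
    // Assertion-free "test": runs the code, so it counts towards coverage, yet it can never fail.
    static void assertionFreeTest() {
        Discount.apply(100, 10);
    }

    // Test with an assertion: same coverage, but it checks the result.
    static void testWithAssertion() {
        int result = Discount.apply(100, 10);
        if (result != 90) {
            throw new AssertionError("expected 90 but was " + result);
        }
    }

    public static void main(String[] args) {
        assertionFreeTest();
        System.out.println("assertion-free test passed");

        try {
            testWithAssertion();
            System.out.println("asserting test passed");
        } catch (AssertionError e) {
            System.out.println("asserting test failed: " + e.getMessage());
        }
    }
}
```

Running `main` prints that the assertion-free test passed and that the asserting test failed with "expected 90 but was 110", which is why coverage on its own says nothing about whether behavior is verified.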