One of the main problems that forensic scientists face is human error in forensic testing, which often leads to faulty results. This is especially tragic when such results lead to wrongful accusations against individuals. The causes of human error are many: for some forensic scientists, emotions or even fatigue may play a role. In contrast to this propensity for human error, machine learning, or Artificial Intelligence (AI), offers the prospect of more accurate and more efficient results in forensic testing.1 But the question of whether to let Artificial Intelligence do this testing has split the scientific field into two groups: the supporters and the reluctant. Meredith Whittaker, a research scientist at New York University, said of the marketing methods of AI companies that they tend to show the bright side of Artificial Intelligence and ignore the many misuses and bias-related problems that arise when it is used. Similarly, Jack Clark, the policy director for OpenAI, an organization that researches the benefits of AI, said that AI systems are exposed to technical issues that could result in mismatching and misclassification errors.2 In contrast, James Bewes and others write that Artificial Intelligence can work rapidly and without falling into the biases that humans do.3 Given this debate among scholars, is it possible to utilize Artificial Intelligence in forensic science to enhance forensic analysis and avoid the pitfalls of human error in testing? Despite the concerns of some scholars, I argue that the use of Artificial Intelligence in forensic testing is not only beneficial to those who employ it but necessary for enhancing different fields of forensic science, because it supports the judgment of forensic scientists and helps improve the methods used to analyze collected data.
What exactly is Artificial Intelligence, and how does it work? One scholar, David Poole, has defined it this way: “Artificial intelligence, or AI, is the field that studies the synthesis and analysis of computational agents that act intelligently.”4 Computer programmers write problem-solving algorithms, or step-by-step instructions on how to solve specific types of problems, and each time the program succeeds at solving a problem, it ‘learns’ to apply that algorithm to successive problems, thus increasing its ‘intelligence’ with each new problem-solution routine. The goal is to design a human-like digital brain that makes decisions based not only on data, as computers regularly do, but also on experience, as humans do. For example, the AlphaZero computer program was taught only the basic rules of chess and then started learning the game by itself. After 24 hours, it was able to play at a superhuman level.5 Programmers can design such intelligent agents to be used in a number of ways. In forensic science, the most useful method of AI is known as “supervised learning.”6 This type of AI is designed to predict results based on its inputs. For example, a supervised machine-learning program takes input data, like temperature or time, and analyzes that data to produce predictions such as true or false, male or female, identified or unidentified.
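To make the idea of supervised learning concrete, here is a minimal sketch in Python. It is not taken from any of the studies cited in this article: the data, the function names `fit_threshold` and `predict`, and the single-measurement setup are all assumptions chosen for illustration. The program is shown labeled examples, picks the cut-off that makes the fewest mistakes on them, and then applies that cut-off to a new input.

```python
from typing import List, Tuple


def fit_threshold(samples: List[Tuple[float, bool]]) -> float:
    """Pick the cut-off that misclassifies the fewest labeled training samples."""
    values = sorted(x for x, _ in samples)
    # Candidate cut-offs: midpoints between neighbouring training values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cut: float) -> int:
        # Count samples where "value above the cut-off" disagrees with the label.
        return sum((x > cut) != label for x, label in samples)

    return min(candidates, key=errors)


def predict(cut: float, x: float) -> bool:
    """Predict True when the measurement lies above the learned cut-off."""
    return x > cut


# Hypothetical labeled training data: (measurement, label).
training = [
    (1.0, False), (2.0, False), (3.0, False),
    (6.0, True), (7.0, True), (8.0, True),
]

cut = fit_threshold(training)
print(cut, predict(cut, 5.0), predict(cut, 2.2))
```

The point of the sketch is only that the rule (the cut-off) is learned from labeled examples rather than written by the programmer; real forensic systems use far richer inputs and models.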
The first forensic field that could benefit from Artificial Intelligence is forensic anthropology, because it relies heavily on recognition and decision-making. In this field, Artificial Intelligence can be used to determine the location of a burned body and the temperatures that the body was exposed to by analyzing the color and structure of the bone remains. In fact, bone coloration has been a diagnostic tool in forensic anthropology for a long time. According to Douglas Ubelaker, bone coloration was used to determine the temperature a body was exposed to: “David’s 1990 experimental work found that in a controlled brushfire, no calcination occurred and fragments appeared brown and black. A campfire size burn using eucalyptus wood reached a temperature of 840 °C in one hour and five minutes and produced white, calcined fragments with some grey and black coloration,” which indicates that forensic scientists were using bone color change quite some time ago.7 However, this method is not accurate, because external factors affect the coloration process, such as the amount of oxygen, the surface on which the burning happened, and the amount of soft tissue.8 To avoid this issue, scientists can use another method, which is to measure the crystallite size of bones after the burning. However, this method has limits of its own: crystallites start to appear when the temperature reaches 400 °C, at a size of 10 nm, and keep growing as the temperature rises until they reach a maximum of 120 nm at 800 °C; above 800 °C, the crystallite size stays the same.9 Researchers have therefore begun using Artificial Intelligence to make the process as accurate and as robust as possible and to overcome the obstacles that make bone coloration hard to interpret.
Sebastian Wärmländer and his team used Artificial Intelligence in their research for the same purpose of determining the temperature of burns through the color of bones. The result was significant. They successfully used K-NN, a simple machine-learning algorithm, to determine the temperature. As a result, “this small-sample pilot study indicates that machine-learning algorithms can be used to estimate rather accurate heating temperatures from digital color measurements of bone surface.”10 In this experiment, Wärmländer used pig bones as the body exposed to heat. The pig bones were heated at different constant temperatures for set lengths of time. Then Wärmländer used a spectrophotometer (Konica-Minolta, Osaka, Japan) to read the colors of the bones in a dark room. After the data were collected, the K-NN algorithm was able to determine the temperature remarkably well, with a median prediction error of 41.6 °C. In addition, the experiment showed which materials had reacted with the bones. Based on that, scientists can now use Artificial Intelligence to determine not only the temperature but also the surface the body was on.11 As noted above, Wärmländer used the K-NN algorithm, which is one of the simplest machine-learning algorithms. A more advanced algorithm would likely produce more accurate results. Machine learning also improves with the amount of data the machine is trained on. Therefore, over time, the results that the machine provides should become even more accurate.
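The K-NN idea can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not Wärmländer's actual code or data: the color triples and temperatures in `TRAINING` are invented, and the names `knn_predict` and `median_abs_error` are hypothetical. To estimate the temperature of a new bone, the algorithm finds the `k` training samples whose measured colors are closest to it and averages their known temperatures.

```python
import math
from statistics import mean, median
from typing import List, Sequence, Tuple

# Invented training data: ((L*, a*, b*) color reading, heating temperature in °C).
TRAINING: List[Tuple[Tuple[float, float, float], float]] = [
    ((70.0, 5.0, 20.0), 200.0),
    ((45.0, 8.0, 15.0), 300.0),
    ((25.0, 2.0, 5.0), 400.0),
    ((30.0, 1.0, 3.0), 500.0),
    ((55.0, 0.5, 2.0), 600.0),
    ((75.0, 0.2, 1.0), 700.0),
    ((85.0, 0.1, 0.5), 800.0),
]


def knn_predict(color: Sequence[float], training=TRAINING, k: int = 2) -> float:
    """Average the temperatures of the k training samples closest in color."""
    nearest = sorted(training, key=lambda row: math.dist(color, row[0]))[:k]
    return mean(temp for _, temp in nearest)


def median_abs_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Median of the absolute differences between predictions and true values."""
    return median(abs(p - a) for p, a in zip(predicted, actual))


estimate = knn_predict((80.0, 0.1, 0.8))
print(estimate)
print(median_abs_error([210.0, 480.0, 790.0], [200.0, 500.0, 800.0]))
```

A median prediction error like the one reported in the study is the kind of figure that `median_abs_error` computes: half of the predictions fall closer to the true temperature than that value, and half fall farther away.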
Using Artificial Intelligence in anthropology is not limited to estimating the temperature a bone was exposed to. Another use is to determine sex based on a skull image. James Bewes and his team used “GoogLeNet,” which is a much more advanced machine-learning model. It is also a form of “supervised learning,” used here to identify photos and label them. In this experiment, Bewes and his team fed the machine 1000 2D photos, divided into 500 females and 500 males between the ages of 18 and 60. The skulls were from different ancestries, “with the six most common ancestries being English (37.3%), Australian (30.9%), Scottish (8.2%), Irish (8.1%), German (6.8%) and Italian (6.8%). The three largest ethnic groups of non-European ancestry are Chinese (3.9%), Indian (2.4%) and Vietnamese (1.5%).”12 The variety of ancestry is important for strengthening the experiment, since skull shape varies across ancestries. Before it could be tested, the machine first had to be trained. Therefore, 900 of the 1000 photos were shown to the machine before Bewes started the test. The remaining 100 photos were 50 males and 50 females. Once the machine had learned the properties of skulls and how to distinguish between the sexes based on them, the 100 remaining photos were presented to the machine to determine the sex. The result was impressive. The machine classified 47 of the 50 male photos and 48 of the 50 female photos correctly, for an overall accuracy of 95%.13 The result here was considerably more accurate than in the Wärmländer experiment. This is likely because of the different types of Artificial Intelligence used in each experiment.
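The arithmetic behind these figures can be checked with a short Python sketch. The counts come from the experiment as described above; the variable and function names are my own and purely illustrative.

```python
from typing import Dict


def accuracy(correct: Dict[str, int], total: Dict[str, int]) -> float:
    """Overall accuracy: all correct predictions divided by all test photos."""
    return sum(correct.values()) / sum(total.values())


# Figures reported for the held-out test set.
total_photos = 1000
training_photos = 900
test_total = {"male": 50, "female": 50}
test_correct = {"male": 47, "female": 48}

# Share of the data held back for testing rather than training.
held_out_fraction = (total_photos - training_photos) / total_photos

# Accuracy within each sex, and across the whole test set.
per_class = {label: test_correct[label] / test_total[label] for label in test_total}
overall = accuracy(test_correct, test_total)

print(held_out_fraction, per_class, overall)
```

Holding back 10% of the photos matters because accuracy measured on photos the machine has already seen would say little about how it handles new skulls.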
In addition, Bewes makes a striking observation: “There is no upper limit for the volume of input data into a neural network—a neural network could potentially train on millions of images of skeletal remains, which is far more than a forensic anthropologist is capable of analysing in an entire career.”14 In other words, Artificial Intelligence can handle far more material than a human can, and it can do so with consistent accuracy.
Artificial Intelligence is also useful in collecting digital evidence. Digital forensics is no less important than other fields. Digital forensic scientists are responsible for collecting digital pieces of evidence from a computer or a network. Such evidence is left behind by common crimes, such as “child exploitation, forgery of documents, tax frauds, and even terrorism.”15 In addition, the number of digital cases has increased with the growth of technology. The FBI has calculated the amount of digital data it examined from 2007 to 2011. In 2007, the amount was 1228 terabytes, and it grew to 4263 terabytes of processed data by 2011.16 These figures run only to 2011, but technology is constantly growing, and the amount of digital evidence will continue to grow. The Computer Science Department of the University of Brasília cooperated with the Brazilian Federal Police to find a solution to the huge amount of digital evidence they were processing. Hoelz and his team used a different kind of Artificial Intelligence in their experiment, which they called MADIK. This system consisted of six agents, or AI-tasked algorithms. Each agent had a specific focus. In their test, the system used only four of the six agents. HashSetAgent was responsible for determining whether a digital file was related to the crime or not. Then FilePathAgent, trained to detect the kinds of files criminals usually use, detected those files on the examined device. Then FileSignatureAgent searched the beginning of every file to determine whether the file was related to the specific fraud crime of the test. And lastly, TimelineAgent searched the history of the browsers and the dates of each site visited. All these agents, or programs, make up the whole system of MADIK. The goal of the test was to have MADIK investigate a computer that had a 75 GB capacity and contained evidence related to a particular fraud case.
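The division of labor among agents can be illustrated with a rough Python sketch. This is not MADIK's actual code, which is not reproduced in this article: the class `FileRecord`, the three agent functions, the rules inside them, and the sample files are all assumptions made for illustration, and the timeline agent is left out for brevity. Each agent looks at one aspect of a file and either suggests ignoring it, flags it for review, or gives no opinion.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FileRecord:
    path: str
    content: bytes


# Hypothetical set of hashes for files already known to be irrelevant.
KNOWN_IRRELEVANT = {hashlib.sha256(b"stock system file").hexdigest()}


def hash_set_agent(f: FileRecord) -> Optional[str]:
    """Suggest ignoring empty files and files whose hash is already known."""
    if not f.content or hashlib.sha256(f.content).hexdigest() in KNOWN_IRRELEVANT:
        return "ignore"
    return None


def file_path_agent(f: FileRecord) -> Optional[str]:
    """Suggest ignoring file types and locations assumed irrelevant to the case."""
    path = f.path.lower()
    if path.endswith((".gif", ".exe")) or "/cookies/" in path:
        return "ignore"
    return None


def file_signature_agent(f: FileRecord) -> Optional[str]:
    """Flag files whose leading bytes contradict their extension."""
    if f.path.lower().endswith(".txt") and f.content.startswith(b"%PDF"):
        return "review"
    return None


AGENTS = [hash_set_agent, file_path_agent, file_signature_agent]


def triage(files: List[FileRecord]) -> List[str]:
    """Keep a file if any agent flags it, or if no agent suggests ignoring it."""
    keep = []
    for f in files:
        votes = [agent(f) for agent in AGENTS]
        if "review" in votes or "ignore" not in votes:
            keep.append(f.path)
    return keep


sample_files = [
    FileRecord("/docs/invoice.txt", b"%PDF-1.4 disguised document"),
    FileRecord("/windows/system.dll", b"stock system file"),
    FileRecord("/web/cookies/a.dat", b"x"),
    FileRecord("/docs/empty.doc", b""),
    FileRecord("/docs/notes.txt", b"meeting at noon"),
]

remaining = triage(sample_files)
print(remaining)
```

In this sketch the agents only make suggestions, and the list that comes out is what a human examiner would still need to look at. That mirrors the role the article describes for AI: narrowing the examiner's workload rather than replacing the examiner's judgment.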
Each of MADIK’s four Artificial Intelligence agents did well at performing its job. HashSetAgent suggested the deletion of 265,519 files (69.8% of the total files on the device) because they were duplicates or empty files. FilePathAgent advised the removal of 24,182 files (9.8% of the total files) because they were gifs, programs, cookies, or Microsoft Office files. After the machine finished, each human examiner was able to examine 85% of the remaining files in two hours. If they had to cover the same amount of data without the aid of MADIK, it would have taken two examiners 24 hours.17 This result alone demonstrates the huge advantages that Artificial Intelligence holds out for forensic science, as it works fast without becoming exhausted.
Any technique or method has advantages and disadvantages, and Artificial Intelligence is no exception to this rule. Many misconceptions about Artificial Intelligence lead people to be concerned and even terrified when they hear about it. Artificial Intelligence, though, is a tool that can and should be used to aid the efforts of forensic science, just as other tools are. What are the real risks of using Artificial Intelligence? Some fear that a machine can learn the wrong lessons. Artificial Intelligence needs time and experience to differentiate its outputs (i.e., male or female, true or false, etc.). The designer sets the possible outputs, and the machine learns how to classify inputs into them, unless the designer restricts the method of learning. Therefore, a problem might occur if a machine classifies its output based on features that should not be used as inputs. To clarify the point, consider an experiment by Ribeiro, in which Google’s pre-trained Inception neural network was used to identify whether an animal in a photo was a husky or a wolf. First, the machine was given twenty photos, a mix of huskies and wolves, to learn from. After the machine finished its learning, it was shown another sixty photos and asked to classify the animal in each as a wolf or a husky. The result was that all the photos that had snow in the background were classified as wolves, and those without snow were classified as huskies.18 The machine classified the photos based almost solely on the backgrounds, which is an invalid method for distinguishing between huskies and wolves. This experiment shows one way in which Artificial Intelligence could be dangerous and give false results based on the wrong information. Artificial Intelligence is only as smart as the programmer who sets up the algorithms it is tasked to work with, so we have to be careful when using AI.
The adjustment should also be done by an expert, and the results should be carefully reviewed to catch any misleading output.
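The husky and wolf problem can be reproduced in miniature with a Python sketch. This is a toy illustration, not Ribeiro's experiment: the training samples, the feature names `snow` and `long_legs`, and the one-feature learner are all assumptions. In the invented training set, every wolf photo happens to have snow and no husky photo does, so the learner finds that snow "explains" the labels perfectly and relies on it.

```python
from typing import Dict, List, Tuple

Sample = Tuple[Dict[str, int], str]

# Invented training photos: snow coincides perfectly with the label "wolf",
# while the hypothetical anatomical cue "long_legs" is only partly reliable.
TRAIN: List[Sample] = [
    ({"snow": 1, "long_legs": 1}, "wolf"),
    ({"snow": 1, "long_legs": 1}, "wolf"),
    ({"snow": 1, "long_legs": 0}, "wolf"),
    ({"snow": 0, "long_legs": 0}, "husky"),
    ({"snow": 0, "long_legs": 0}, "husky"),
    ({"snow": 0, "long_legs": 1}, "husky"),
]


def feature_accuracy(feature: str, samples: List[Sample]) -> float:
    """Share of samples where 'feature present' coincides with the label 'wolf'."""
    hits = sum((feats[feature] == 1) == (label == "wolf") for feats, label in samples)
    return hits / len(samples)


def fit(samples: List[Sample]) -> str:
    """Choose the single feature that best separates the training labels."""
    features = samples[0][0].keys()
    return max(features, key=lambda f: feature_accuracy(f, samples))


def predict(feature: str, feats: Dict[str, int]) -> str:
    """Label a photo as wolf when the chosen feature is present."""
    return "wolf" if feats[feature] == 1 else "husky"


chosen = fit(TRAIN)
husky_in_snow = {"snow": 1, "long_legs": 0}
print(chosen, predict(chosen, husky_in_snow))
```

Printing `chosen` is the kind of inspection an expert reviewer would perform: it reveals that the model is looking at the background rather than the animal, which is exactly the sort of misleading result that careful review is meant to catch.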
In conclusion, Artificial Intelligence aids the judgment and methods of forensic scientists. It is a tool for all those involved in the various forensic fields. Determining skeletal sex in forensic anthropology and reducing the amount of data on devices that are forensically examined are but two examples that illustrate how beneficial AI can be. We should not ignore possible problems in the use of Artificial Intelligence, but we can overcome these problems and dispel the fear that many feel toward AI, for the sake of progress in forensic science. Once the fear of using AI is overcome, the fields of forensic science will undoubtedly benefit.
This article would not be what it is without the help of several people whom I greatly appreciate. I would like to thank Dr. Peter Platteborze for instructing me in the early stages of writing the article. His experience was very helpful, and he was always willing to help. I also appreciate the support that I received from Christopher Hohman, who helped a great deal with creative ideas. Finally, I am very grateful to Dr. Bradford Whitener. I have learned a lot from the advice and instruction he gave me.
1. Miguel Paredes, “Can Artificial Intelligence Help Reduce Human Medical Errors? Two Examples from ICUs in the US and Peru,” Tech Policy Institute (February 19, 2018): 9-10, https://techpolicyinstitute.org/wp-content/uploads/2018/02/Paredes-Can-Artificial-Intelligence-help-reduce-human-medical-errors-DRAFT.pdf
2. “Artificial Intelligence: Societal and Ethical Implications | House Committee on Science, Space and Technology,” accessed April 5, 2021.
3. James Bewes et al., “Artificial Intelligence for Sex Determination of Skeletal Remains: Application of a Deep Learning Artificial Neural Network to Human Skulls,” Journal of Forensic and Legal Medicine 62 (February 1, 2019): 40–43.
4. David L. Poole and Alan K. Mackworth, Artificial Intelligence: Foundations of Computational Agents (New York: Cambridge University Press, 2010), 3-4.
5. David Silver et al., “Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm,” ArXiv:1712.01815 (Cs) December 5, 2017, 1–2.
6. Carolina Crisci, Badih Ghattas, and Gonzalo Perera, “A Review of Supervised Machine Learning Algorithms and Their Applications to Ecological Data,” Ecological Modelling 240 (August 1, 2012): 114.
7. Douglas H. Ubelaker, “The Forensic Evaluation of Burned Skeletal Remains: A Synthesis,” Forensic Science International 183, no. 1 (January 1, 2009): 3.
8. Kazuhiko Imaizumi, “Forensic Investigation of Burnt Human Remains,” Research and Reports in Forensic Medical Science 5 (December 10, 2015): 69.
9. Susan Essien Etok et al., “Structural and chemical changes of thermally treated bone apatite,” Journal of Materials Science 42, no. 23 (2007): 9807.
10. Sebastian K. T. S. Wärmländer et al., “Estimating the Temperature of Heat-Exposed Bone via Machine Learning Analysis of SCI Color Values: A Pilot Study,” Journal of Forensic Sciences 64, no. 1 (2019): 5.
11. Sebastian K. T. S. Wärmländer et al., “Estimating the Temperature of Heat-Exposed Bone via Machine Learning Analysis of SCI Color Values: A Pilot Study,” Journal of Forensic Sciences 64, no. 1 (2019): 1-4.
12. James Bewes et al., “Artificial Intelligence for Sex Determination of Skeletal Remains: Application of a Deep Learning Artificial Neural Network to Human Skulls,” Journal of Forensic and Legal Medicine 62 (February 1, 2019): 41.
13. James Bewes et al., “Artificial Intelligence for Sex Determination of Skeletal Remains: Application of a Deep Learning Artificial Neural Network to Human Skulls,” Journal of Forensic and Legal Medicine 62 (February 1, 2019): 40-42.
14. James Bewes et al., “Artificial Intelligence for Sex Determination of Skeletal Remains: Application of a Deep Learning Artificial Neural Network to Human Skulls,” Journal of Forensic and Legal Medicine 62 (February 1, 2019): 43.
15. Bruno W.P. Hoelz, Célia Ghedini Ralha, and Rajiv Geeverghese, “Artificial intelligence applied to computer forensics,” in Proceedings of the 2009 ACM symposium on Applied Computing, (March 2009): 1, https://doi.org/10.1145/1529282.1529471.
16. Alastair Irons and Harjinder Singh Lallie, “Digital forensics to intelligent forensics,” Future Internet 6, no. 3 (2014): 586.
17. Bruno W.P. Hoelz, Célia Ghedini Ralha, and Rajiv Geeverghese, “Artificial intelligence applied to computer forensics,” in Proceedings of the 2009 ACM symposium on Applied Computing, (March 2009): 5, https://doi.org/10.1145/1529282.1529471.
18. Marco Ribeiro, Sameer Singh, and Carlos Guestrin, “‘Why Should I Trust You?’: Explaining the Predictions of Any Classifier,” in Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations (San Diego, California: Association for Computational Linguistics, 2016), 1142–1143.