The widespread use of Artificial Intelligence (AI) Large Language Models (LLMs) presents new challenges we must address and new questions we must answer. For instance, what do we do when AI is wrong? I teach two Master’s-level courses at Georgetown University, and I’ve received guidance on how the program allows the use of tools like ChatGPT and Bard.
I expected some students to use AI and LLMs without properly validating the generated content or attributing its sources. In one instance, students turned in oddly similar work that may have originated, in part or in full, from an AI LLM. In that particular case, however, they had also sought supporting materials, such as results from an Internet search engine. Then the autumn 2023 semester began, and a new pattern emerged.
A trend of non-vetted content
Not long into the autumn 2023 semester, students began to cite blogs and vendor materials that read plausibly but were partly or entirely incorrect. This problem traces back to LLMs producing ‘hallucinations’. In some cases, vendor content creators incorporate these false claims directly into their published content without vetting or correcting them.
It wasn’t an infrequent problem during the autumn 2023 semester. In the past four years of teaching three semesters a year, I encountered just one activity where several students found incorrect information as the result of a highly ranked search result. During the autumn 2023 semester, however, I noticed the problem on at least three separate assignments. In one case, the information was put together so well in the source materials that it caught me off guard. I had to validate my own understanding with others to confirm the material was wrong!
Let’s take a look at a couple of examples to better understand what’s going on.
Misidentifying AI libraries/software as operating systems
In one example, students cited descriptions that presented what are likely AI-related libraries or software as operating systems. In a recent module on operating systems, for instance, students enthusiastically described ‘artificial intelligence operating systems (AI OS)’ and even ‘Blockchain OS’. There’s just one issue: there’s no such thing as an AI OS or a Blockchain OS.
This content made it online because no one corrected it before it was published in multiple places as blog content. Inaccurate descriptions, such as those calling AI libraries or software development kits operating systems, add confusion when students and even professionals use Internet resources to learn about new developments and technologies. In this case, students needed to learn about the evolution of operating system architecture. Vetted materials were available, but some students veered into their own research and wound up using sources whose content was not accurate. To its credit, the content was descriptive and convincing, although incorrect.
The issue here is more than just semantics or nuance. This type of content makes it harder for students to grasp the purpose of an operating system versus that of libraries, software development kits, and applications, concepts that are fundamental to system architecture and its security.
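To make the distinction concrete, here is a minimal sketch of my own (not from the original post). It uses NumPy purely as a stand-in for any AI or numeric library: the library is user-space code that depends on an existing operating system for scheduling, memory, and hardware access, rather than replacing the operating system.

```python
import platform
import numpy as np  # stand-in for an AI/numeric library: application-level code

# The library runs on top of an operating system; it is not one itself.
print("Library version:", np.__version__)
print("Underlying OS:  ", platform.system(), platform.release())

# The OS still provides process scheduling, memory management, and hardware
# access; the library simply requests those services through system calls.
data = np.random.rand(1_000)
print("Mean computed in user space:", data.mean())
```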
False authentication protocols
Another example of non-vetted AI output is online content that inaccurately describes authentication, creating misinformation that continues to confuse students. For instance, some LLM results describe the Lightweight Directory Access Protocol (LDAP) as an authentication type. While LDAP supports password authentication and serves up public key certificates to aid in PKI authentication, it is a directory service. It is not an authentication protocol.
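A minimal sketch, using the third-party Python ldap3 library, may help illustrate the distinction; the server name, distinguished names, and password below are hypothetical placeholders. The simple bind is the password check that leads people to mistake LDAP for an authentication protocol, while the search shows its actual role: looking up directory entries, including certificates that other mechanisms can use for PKI-based authentication.

```python
from ldap3 import ALL, Connection, Server

# Hypothetical directory server and entries, for illustration only.
server = Server("ldap.example.org", get_info=ALL)

# A "simple bind" verifies a password against a directory entry. This is the
# feature that gets LDAP mislabelled as an authentication protocol.
conn = Connection(
    server,
    user="uid=jdoe,ou=people,dc=example,dc=org",
    password="s3cret-placeholder",
)

if conn.bind():
    # The directory-service role: query entries and attributes, such as a
    # user's certificate that a separate PKI-based mechanism might consume.
    conn.search(
        "ou=people,dc=example,dc=org",
        "(uid=jdoe)",
        attributes=["cn", "mail", "userCertificate;binary"],
    )
    print(conn.entries)
    conn.unbind()
```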
Vetting in education and infosec
The problem I’ve described above is likely happening in more fields than security architecture and design. When it comes to validating content in any field, two themes come up consistently:
Author credibility
- Is the author recognized for the work, the topic cited, or closely related work?
- Is there evidence that other experts have validated the content?
- When was the material published? Have the authors applied any updates or corrections?
Source credibility
- Do sources support the conclusions?
- Are the sources ones you would consider to be trustworthy or known to be vetted?
- If standards are referenced, do the materials provided by the standards committee support the language and claims? Are the technical terms consistent with those used by the standards committee?
As a way forward, in consultation with the CIS Marketing and Communications team, we will be adding a marker to blogs to communicate the level of review before publication. For my own posts, I’ve reached out to known experts to review them (in one case, I decided to hold a post from publication due to an oversight that required correction). This is more of an allow-list approach toward understanding what content has been vetted, rather than expecting AI-generated results to be marked.
As for fellow teachers, you can and should provide guidance on sources known to be reliable within a field of study; this is something I did with my students after detecting the problem. Students should check that sources have been vetted and that the content creator has the credentials to stand behind their published content.
The creation of a new best practice
The problems around vetting AI results won’t go away anytime soon. Educators must make sure students have proper guidance to direct their research in a field of study, and they should embrace review markers as a best practice. Such markers can go through a consensus process to gain acceptance as a new best practice, which could ultimately prove useful for updating and sharing content expediently.
Kathleen Moriarty is a Technology Strategist, CTO, Board Member, Keynote Speaker, Author, CISO, and former IETF Security Area Director. She has more than two decades of experience working on ecosystems, standards, and strategy. Kathleen was CTO at the Center for Internet Security when writing this post.
Adapted from the original at CIS Blog.
The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.