Microsoft recently released a study about the types of jobs most likely to be augmented by AI in the future. It led many people to assume those are the jobs that will soon be replaced by machines, and since “historian” ranked second highest on the list, it raised more than a few eyebrows among historians on social media. But after I tested generative AI tools with some historical facts, it seems like historians shouldn’t worry too much about the robots taking over just yet. At the very least, they shouldn’t be afraid that AI could do their job well.
Presidential movies
What history facts did I test? I’m fascinated by the movies that presidents have watched while in office. So that’s where I started.
It’s an odd hobby, but I’ve been researching the topic since 2012, when I found a list of the movies that Ronald Reagan watched at the White House and Camp David. I was inspired to file a Freedom of Information Act (FOIA) request for the movies that then-president Barack Obama had been watching, but I found out that presidential records are exempt from FOIA until five years after a president leaves office. But that didn’t deter me. I dug into the subject and have been combing through a vast array of sources on presidential movie-watching habits ever since, dating back to Teddy Roosevelt’s first screening of a bird documentary in 1908.
If you’re going to test a generative artificial intelligence tool, you need to test it using something you know really well. When people ask questions of ChatGPT, they’re typically asking about things they don’t know, which makes sense. These are supposed to be tools that help us do things better and faster. And if they worked as advertised, they would be fantastic. The problem is that they often do not work as advertised.
I asked questions that were a mix of things I knew could be easily answered by a simple Google search and other questions that would be more difficult to find in books and archives. And the results might be eye-opening if you’re still trusting AI chatbots for work you care about getting right.
OpenAI’s GPT-5 flopped
I first tried to test OpenAI’s GPT-5, asking questions about what movies various presidents may have watched on specific days. I chose dates from when the White House was occupied by Woodrow Wilson, Dwight Eisenhower, Richard Nixon, Ronald Reagan, George H.W. Bush, Bill Clinton, and George W. Bush. Each time, ChatGPT replied that it could find no record of any of those presidents watching movies on the dates I provided.
Thankfully, ChatGPT didn’t just lie to me, as it’s been known to do, but it failed to answer some pretty basic questions. GPT-5 is now available to all free users, but there’s a lack of transparency about which model it’s using to answer a given question, and it wasn’t clear what was going on under the hood when I asked about specific dates.
OpenAI has gotten a lot of flak since it released GPT-5 last week. CEO Sam Altman promised it would be like having “a legitimate PhD-level expert in anything, any area you need, on demand, that can help you with whatever your goals are.” But the company nixed the ability of users to pick from the old models, breaking all kinds of workflows and making hardcore users angry. Altman has since backtracked, and ChatGPT now offers access to 4o for subscribers. But my tests have not been encouraging if you want answers to unique questions without a pricey subscription.
There are some people who claim that the only thing keeping CEOs from replacing human workers with AI at scale is some kind of political calculation. They say executives don’t want the bad publicity that comes with mass layoffs. But that explanation simply doesn’t ring true. These tools still need human babysitters because they get so many things wrong so frequently. And my tests with other major AI chatbots like Google Gemini, Microsoft Copilot, Perplexity, and xAI’s Grok demonstrate these tools are far from perfect. CEOs may be willing to settle for “good enough” when it comes to a lot of work. But if you need something that’s accurate, a human needs to be in the loop in many different use cases.
Eisenhower and Grok’s surprising answer
ChatGPT may not have known anything, so next I tried my presidential movie questions with Copilot, which allows users to search with various OpenAI models but also has a “Deep Research” option that can take up to 10 minutes for each search before it spits out an extensive report. I ran the same questions I tried on GPT-5 on Copilot twice, first with the Quick Response option that uses GPT-4o, then using Deep Research.
The responses from Copilot in Quick Response mode were terrible. I asked what movie President Eisenhower watched on August 11, 1954. Right out of the gate, Copilot said President Eisenhower watched The Unconquered, a documentary about the life of Helen Keller. That’s not true, though the AI noted Eisenhower briefly appears in archival footage during that film, which may be why it gave the wrong answer.
I switched over to Deep Research mode, and Copilot was still wrong, just in a much longer way. In fact, it produced over 3,500 words on the question. Copilot explained that the summer of 1954 saw the release of several notable films that were “likely to have been considered for White House viewing.” The bot listed About Mrs. Leslie, Suddenly, Rear Window, and Living It Up, among others. In Copilot’s analysis, President Eisenhower probably watched Suddenly.
Suddenly is a strange guess because it wasn’t released in theaters until Oct. 7, 1954, several months after the date I asked about, though presidents have sometimes gotten special previews of new films. Copilot seemed to take it for granted that there must have been a movie screening on August 11, 1954, because I asked it the question. But if it didn’t find a listing for the movie in any reliable sources, it’s unclear why it would try to guess.
From Copilot (emphasis mine):
The balance of circumstantial and secondary evidence points to “Suddenly” as the film President Eisenhower watched on August 11, 1954. While a direct, digitized log entry for that screening remains elusive, the convergence of release timing, script inspiration, and theme, together with subsequent archival references, make this the best supported answer.
That’s not “evidence” of any kind, though. The best supported answer is arguably that a movie screening didn’t happen. If the bot couldn’t find a concrete document to support the assertion, it’s very weird to try and just make one fit. As it happens, I know what movie Eisenhower watched because I have a copy of the White House projectionist’s log book from the 1950s, the kind of thing that human historians get a hold of. And Eisenhower watched the movie River of No Return, directed by Otto Preminger and starring Marilyn Monroe and Robert Mitchum, on August 11, 1954.

I asked Gemini the same question about Eisenhower’s movie selection for August 11, 1954, but it didn’t have an answer. And Perplexity also guessed Suddenly, which isn’t correct. But one of the sources cited by Perplexity provides a clue as to why both Copilot and Perplexity might think that’s the answer. The writer of the film, Richard Sale, reportedly got his idea for the Frank Sinatra flick while reading about Eisenhower’s trips to Palm Springs, California, according to Wikipedia. That fun fact really seemed to throw off the robots.
It may surprise you to learn that xAI’s Grok didn’t get the answer on the first try, but after clicking “think harder,” it correctly answered River of No Return. How did it know? The source was my Twitter account, All the Presidents’ Movies, and a tweet from 2019 where I shared it.
President Eisenhower watched the movie River of No Return (1954), starring Robert Mitchum and Marilyn Monroe, on August 11, 1954.
Eisenhower hated Mitchum because the actor was arrested for weed in 1948. It was the only Mitchum movie shown at the White House during Ike's tenure. pic.twitter.com/24W6KOtlnz
— All the Presidents' Movies (@PresidentMovies) August 11, 2019
It makes sense that Grok, which has been trained on all of X’s tweets, would find this one. But you kind of have to just take it on faith that you’re getting the right answer with Grok. This is the Hitler-praising AI, after all. I didn’t cite a source in that tweet, and it’s just a small account I started to toy around with “this day in history” fun facts about presidential movies. If, for instance, the account I had started just asserted that Eisenhower had watched the Nazi film Triumph of the Will (1935) on that day, it seems very likely that’s how Grok would’ve responded.
This wasn’t a question I was expecting most bots to answer correctly, given the fact that I don’t believe these log books have been published widely. But it speaks to why historians are needed for stuff like this if you want more than information that’s just synthesized from things that are readily available on the internet. So next I tried a question that was much easier.
Nixon’s obsession with Patton
What movie did Richard Nixon watch on Feb. 12, 1971? The answer is The Great Chase (1962), which Nixon watched at businessman Robert Abplanalp’s home in the Bahamas. That fact is noted in the 2004 book Nixon at the Movies by Mark Feeney, the definitive account of Nixon’s movie-viewing habits. But Copilot’s Quick Response got it wrong.
“On February 12, 1971, President Richard Nixon watched the film Patton at the Key Biscayne compound in Florida,” Copilot claimed, even providing a link to a source. If you click that source—a daily schedule for Nixon on that day held by the National Archives—there’s no movie listed on Feb. 12, 1971. “Nixon was reportedly deeply impressed by the film and even referenced it in later speeches. It’s said to have influenced his thinking on leadership and military strategy. Quite a cinematic choice for a sitting president!” Copilot explained.
There is a businessman named T.F. Patton listed in one of the documents linked to on that National Archives page in an appendix about a business council, but that obviously has nothing to do with a movie screening.
I tried out the Deep Research version of Copilot with the same question. It delivered the correct answer, The Great Chase, ultimately finding the citation in Feeney’s book, but it introduced other false claims in its very long report. For example, the bot claimed that “For many years, both popular myth and some reputable sources have asserted that Nixon watched the film Patton (1970) on February 12, 1971, either at Key Biscayne, his regular retreat, or at the White House.”
That doesn’t appear to be true, at least not about that particular date. Nixon watched Patton several times, but there doesn’t appear to be any evidence that he watched it on Feb. 12, 1971. Copilot also claimed that Nixon watched the movie on back-to-back days on April 24 and 25, 1970. But that’s not true either, if you look at the list that’s linked as the source, which is Feeney’s own list. Nixon watched The Cincinnati Kid on April 24, 1970, at Camp David, and then he watched Patton at the White House on April 25, 1970, according to both Feeney and records held by the National Archives.
Grok got the answer right, though its sourcing was opaque for anyone who wanted to find something reputable to cite. The top source was listed as Nixon’s Daily Diary, but The Great Chase isn’t listed there. The seventh source provided was Feeney’s list.
Perplexity insisted that Nixon watched The Good, the Bad and the Ugly on Feb. 12, 1971. The source linked was Feeney’s list, and Perplexity seems to have been confused because that’s the movie Nixon watched a year later on Feb. 12, 1972. Gemini got the movie title right but insisted the president watched it in Key Biscayne, Florida. That’s also not true.
Woodrow Wilson watched more than one movie
What movie did Woodrow Wilson watch on March 6, 1917? The answer is The Crisis (1916), a silent movie that I ordered from the Library of Congress and uploaded to YouTube and the Internet Archive because it wasn’t previously available. The movie has never received a home release, and there was no place to watch it online. So I fixed that. The date of the screening is provided in the 2012 book Col. William N. Selig, the Man Who Invented Hollywood by Andrew A. Erish.
Grok didn’t know if a movie had been screened at the White House on that day. Perplexity insisted there was no credible evidence that Wilson watched a movie on that date. Copilot’s Quick Response claimed Wilson watched The Birth of a Nation (1915). The Deep Research answer also said The Birth of a Nation, but insisted there “was some confusion about the exact date of the screening.” There is no confusion. Wilson watched The Birth of a Nation on Feb. 18, 1915, according to countless historical sources. He didn’t watch it on March 6, 1917, the date I asked about. But it’s the most famous movie Wilson watched while in office, so Copilot clearly tried to mold my question into that reality.
ChatGPT falsely claimed that The Birth of a Nation was the first movie ever screened at the White House. President William Howard Taft and President Teddy Roosevelt didn’t screen many movies, and the ones they did screen were all shorter than The Birth of a Nation (as almost all movies were before 1915), but those screenings happened.
Reagan and Rambo
If you ask Copilot which movie Ronald Reagan watched on June 15, 1985, the answer it will spit back in Quick Response mode is Rambo: First Blood Part II (1985), and that’s not correct. But you can check the source and figure out why. Copilot seemed to be confused by a sentence at a website called bestofdate.com that describes June 15, 1985, in that clichéd way we all know: “Ronald Reagan is the President of the United States, and the movie Rambo: First Blood Part II is at the top of the box office.” That obviously doesn’t mean Reagan watched the Rambo sequel on that date.
The correct answer is The Lion in Winter, which President Reagan watched at Camp David, according to the Reagan Library. If you ask Copilot to do Deep Research on the question, it eventually gives the right answer. But in an effort to provide more context and be robust, it also creates a table that includes many errors.

As you can see above, the heading promises other movies watched by Reagan in June 1985. The first film listed is Alfred Hitchcock’s Topaz, watched June 1, 1985, at the White House with a note that says “No weekend at Camp David.” But as the Reagan Library’s list makes clear, the president and Nancy Reagan watched that at Camp David. Why did Copilot trip up? Because it looks like the heading for the White House Daily Diary is wrong. As you can see for June 1, 1985, the official record lists the president’s location as the White House. So which one is right? This is where a human historian finds other sources to break the deadlock in a primary source document that appears to have an error.
If we look at the 2007 book The Reagan Diaries, it notes that the president helicoptered to Camp David on May 31 and left Camp David for the White House on June 2. That’s also confirmed from the schedule kept by the White House staff, even if the June 1 log lists the president at the White House. He watched Topaz at Camp David.
The list generated by Copilot also claimed that Reagan didn’t watch a movie on June 8, 1985, and spent the day at the White House, which isn’t true. Reagan watched the 1971 movie Big Jake starring John Wayne at Camp David. The list claims Reagan watched The Natural (1984) on June 22, 1985, which is also a lie. He actually watched MacKenna’s Gold (1969). Copilot also claimed Reagan watched the movie Witness (1985) on June 29, 1985. The president watched Witness in February of that year, and he doesn’t appear to have watched any movie on June 29. You can see how these kinds of inaccuracies would be a problem for anyone trying to do serious research.
Grok, Perplexity, and Gemini all got the answer to this one right.
Bush Goes Old School
What movie did President George H.W. Bush watch on August 8, 1989? The answer, according to the Bush Presidential Library in Texas, is the 1942 World War II classic Mrs. Miniver. But ChatGPT didn’t know that, Grok didn’t know, and Copilot in Quick Response didn’t know. Gemini 2.5 Pro? Nope. Perplexity? That AI said Bush watched the movie Batman that day, which isn’t true. If you click on Perplexity’s source for that fun fact, it takes you to a Wikipedia list of the presidents by age. It’s unclear why it would do that since Batman isn’t mentioned anywhere on that list.
Copilot in Deep Research mode acknowledged it didn’t know and gave a list of movies that were released around the time, including Batman, which it characterized as “likely options for a White House screening based on box office success, presidential taste, and prevailing film culture.” But Mrs. Miniver obviously wasn’t on the list since it was released in the 1940s. Presidents have often watched older movies, typically for nostalgic reasons, which means a random list of movies released around the date you’re asking about isn’t very helpful.
Clinton and the Mystery Warrior
I moved on to a question about President Bill Clinton. What movie did he watch on September 4, 1999? The correct answer was the comedy Mystery Men. Perplexity and Grok got it right.
Gemini said Clinton watched a movie called The 13th Warrior at Camp David. The chatbot claims the source for that is the “official daily schedule from the Clinton Presidential Library.” But I submitted a FOIA request to the Clinton Library years ago to get a list of every movie screening during Clinton’s time in office and even wrote an article about it in 2016. Clinton watched the Ben Stiller movie Mystery Men. Gemini appears to have scraped the movie title The 13th Warrior, a movie starring Antonio Banderas, from some Clinton-adjacent papers held by the National Archives. It looks to be a forwarded email about weekend box office receipts from the Christian Science Monitor. Why did Gemini decide that such a document meant Clinton watched The 13th Warrior on September 4, 1999? Your guess is as good as mine.
Copilot’s Quick Response said Clinton watched Notting Hill, which isn’t right either. But after waiting, Deep Research came up with the right answer. The rest of Copilot’s several thousand words clearly leaned heavily on the original research I’d published online.
Bush and the Twin Towers
I asked all of the various chatbots about George W. Bush and what movie he watched on September 10, 2003. The answer is the short documentary Twin Towers. Or at least that’s what I thought it was until I ran the question through various chatbots.
Perplexity and Grok said it was Twin Towers, both citing FOIA documents I had requested in 2015 that the Bush Library had posted online. Copilot’s Quick Response said DC 9/11: Time of Crisis, a made-for-TV movie about Bush’s response to the 9/11 attacks, which aired on Showtime. Copilot’s Deep Research gave the same incorrect answer.
From Copilot:
The documentary “Twin Towers” (an Oscar-winning short) was also circulating during this period, sometimes shown alongside or in proximity to larger dramatic works, but in the context of the White House viewing, the focus was on the dramatized retelling provided by “DC 9/11.”
I’ve found no evidence that a docudrama, which only aired on Showtime, was screened at the White House. They watched the documentary.
Test it yourself
My tests aren’t scientific. But they weren’t designed to be. The AI companies will publish various benchmark tests and insist their improved bot is now this much better at reasoning or that much better at refusing to hallucinate, as they call it. But OpenAI got caught sharing an absurd graph last week, and in any case, the only test that really matters when you’re using a new tool is how it works for your specific use case. The only way to test that properly is to try it out for yourself using information that you know well. It’s a boring thing to do, but it’s the only way you can quickly gauge whether it knows the things you know.
Generative AI tools like ChatGPT are being sold as all-purpose tools that can answer any question you throw at them on any subject in the world. And that’s obviously a very tall order. Sam Altman used to talk about artificial general intelligence (AGI) as something that was just over the horizon. And while he’s still arguably overhyping the general-purpose use cases of AI, it’s notable that Altman recently said it’s no longer a “useful term” because nobody can agree on what it means.
My tests were also incredibly narrow when it comes to what a historian actually does. It’s not the job of a historian to merely collect all the facts that have already been published and rearrange them in new ways. New historical research relies on the research that has come before it, but good historians are always adding something new. They find things that are hard to find in archives, and they conduct interviews with experts or first-hand witnesses to historical events. They contribute something beyond a repetition of the things people have already published. My test was just about dates. But even that is a tiny sliver of what historians contribute to our understanding of the past.
If you test out AI for yourself, it will get lots of things right. Millions of people find it very useful for many tasks. But every once in a while, it’s good to ask your robot some things you know really well, just to remind yourself that this tool doesn’t know everything. Because too many people have been lulled into thinking it’s a god. And when we get too far down that path, not only do we all get dumber, but people start to lose their minds.