Oh wow, funny this should turn up here. I had of course read the story in Men of Mathematics, and when I saw this question posted on MathOverflow in 2015, I decided the existing answers were inadequate (for me at least), so I went super deep on this question[0], reading some obscure papers in German and Latin via typing them up into Google Translate, but never got around to writing up the definitive answer to this. Let me post an outline of my findings here, as it's unlikely I'll write that post anyway:
• The account by E. T. Bell, of Cole standing up and wordlessly carrying out a multiplication on the blackboard, ending with applause, is, in my opinion after considering all the facts available, almost certainly a "dramatization" by Bell.[1] However, I'm inclined to believe that Cole basically followed the structure of his paper[2], which does have a dramatic end and there would have been applause (even if his talk was not wordless). Also, the "three years of Sundays" seems about right.
• Unlike the implication of Bell's story, M_67 = 2^67 - 1 was already known to be composite by 1903, as Lucas[3] had already shown it (and Cole cites him in his paper). What was not known was a factorization, which is what Cole did.
• Cole's paper is very elliptically written and only focuses on his original contribution, but if you read it carefully for the factorization method, the true hero of this story IMO is Paul Peter Heinrich Seelhoff (1829–1896). This Seelhoff (does not have an article on Wikipedia, even on the German one) was a high-school teacher in Mannheim, and he wrote some papers that have many great ideas but are also full of computation errors. No one seems to have paid his work much attention before Cole (his name shows up in only very few histories of mathematics / number theory from that period).
• Seelhoff's method, which Cole used, is basically what we now call the Quadratic Sieve[4], “still the fastest for integers under 100 decimal digits or so”. It's amazing how many modern algorithmic ideas Seelhoff anticipated, even things like optimization tweaks and when to stop. (After Cole only Kraitchik seems to have read Seelhoff closely, and it's via Kraitchik that the modern methods like CFRAC and quadratic sieve came to be, after the advent of computers.)
• The idea at heart is a generalization of Fermat's factorization method, and goes back to Lagrange (and Gauss): to factor N, if you can find a solution to x^2 ≡ y^2 (mod N), where x ≢ ±y (mod N), then you're done, because N divides (x-y)(x+y) and taking gcd(N, x-y) and/or gcd(N, x+y) will give a nontrivial factor.
• If you have some magical method that tends to give small quadratic residues mod N, that will help, in two ways: the very fact that (say) 7 is a quadratic residue mod N (and therefore mod its residues) will impose some condition (by quadratic reciprocity) on the prime factors of N, which will knock out half the possible values. So with k quadratic residues you need to try only 1/2^k of the primes. The second way it helps (due to Seelhoff) is that your method may end up giving you two ways the same quadratic residue can be achieved, i.e. a nontrivial solution to x^2 ≡ y^2 (mod N).
• The idea of the quadratic sieve (and Seelhoff's innovation) is to find such residues by doing the following: first, try to find "smooth" residues mod N, by looking for conditions on x to make a=x^2-N be divisible by several chosen small primes. Then, whichever of these these residues you can fully factorize, write it as (basically) a bit vector of the primes that divide it. Keep track of these bit vectors (write them down as rows one below the other, say) and do row reduction: the moment you get a "cycle" (nontrivial linear relation), you're likely done, as you have x^2 = y^2 (mod N).
• Cole did come up with an idea or two of his own, which cut the search space down (in the third phase) by a factor of 4 (instead of 2) for each residue: thus, estimating that it was doable, he had the persistence to carry out this project of hand calculation. But the bulk of this calculation was finding quadratic residues and using the resulting congruence conditions, not trial division.
I'll stop here as I have the sense this comment has become super long, but you can get a good idea of all this in Factoring Integers Before Computers (1994) by H. C. Williams and J. O. Shallit (cited in the page below).
[1]: I reach this conclusion from a careful reading of E. T. Bell's biography by Constance Reid, which BTW is an amazing book. The first part of it reads like a detective story (her own work unearthing details about Bell). Also, at one point there's a cameo appearance by Arthur C. Clarke, as a fellow science-fiction writer (E. T. Bell wrote some SF novels as John Taine) who happened to live in Sri Lanka. BTW Bell was almost certainly not present at that New York meeting of the AMS where Cole read his paper. Also, Cole was likely Bell's PhD advisor later (surprising we can't be sure of advisor-advisee relationships this late).
[3]: Lucas is another hero; you can read H. C. Williams's book about him. He wrote papers on agriculture etc and randomly turned to math and did an amazing amount of work (on factoring etc — and it was he who made Fibonacci numbers a worthy object of study and gave them that name) in just a few short years, before abruptly abandoning math completely after someone gave a bad review to one of his papers.
The brilliance of Édouard Lucas: see "Lucas primality test" (for general N, when you can factor N-1) and "Lucas–Lehmer primality test" (for Mersenne numbers N). Both work for M_67. There's a nice account somewhere of how Lucas carried out the latter test with a 127x127 chessboard to prove that M_127 is prime. For M_67, Lucas had done the calculations and declared it composite, and there was confirmation from another author. And Cole seems to have trusted this too. (I mention this because Shallit and Williams point out that on some occasions Lucas seems to have wavered in his confidence, for some reason….)
A Fermat test with base 3 shows the number to be composite. I think that could be done in a day or so of hand calculation. That amounts to a brute force method and as svat says, there are better ways. The Fermat test only shows that it can be done without much subtlety.
I've already left it as a comment (actually, three comments!) on the OP's MathOverflow (self-)answer. I'd like to be a little more sure that I've understood everything perfectly (e.g. implement the algorithm myself), and put it in a more coherent form, before posting as an answer on MathOverflow. :-)
Being sure you give a correct & coherent answer is important. You mention "as it's unlikely I'll write that post anyway", so I'd just hate if perfect were the enemy of the good and the answer was never written.
Maybe write it now and revise it as your understanding grows? Also, posting it now might encourage others to evaluate it for you, doing the work you may never get to.
Just a suggestion. Thanks for your deep dive into this and writing it up in the HN comment! Very insightful.
This is mentioned in one of the William Dunham books, how Cole was presenting at a conference and silently multiplied two by itself 67 times, subtracted one, and then multiplied the two factors he'd found and got the same number. And apparently received a standing ovation. Until then, it was not known whether it was prime (it's of the form 2^p-1 with p prime which iirc are potential Mersenne primes)
Yeah this story originates from E. T. Bell, who I'm now fairly convinced just… made it up. (Possibly based on a kernel of truth about the conference presentation involving a lot of calculation, and ending in applause.) He is known for doing that sort of thing.
(And Mersenne's list was already known to be inaccurate because of missing M_61 which is prime, and this number M_67 had been independently declared composite by Fauqembergue in 1894 and by Lucas, so was already believed to be composite.)
E. T. Bell also has an amusing story about Condorcet. Condorcet was a French aristocrat and during the Revolution had to disguise himself as a commoner to escape France. However, when he was stopping at an inn, the innkeeper asked him how many eggs he would like in his omelette. An an aristocrat, he had no idea how many eggs went into an omelette so he said a dozen, thereby outing himself and ending up in prison.
But as far as I can tell, Bell just made that story up, too.
The Britannica biography says 'On the evening of the 7th April 1794 — not, as Carlyle says, on a " bleared May morning,"'. Using Carlyle" and "bleared May morning" as search terms finds it comes from "The French Revolution" by Thomas Carlyle, first published in 1837.
> Condorcet has lurked deep, these many months; Argus-eyes watching and searching for him. His concealment is become dangerous to others and himself; he has to fly again, to skulk, round Paris, in thickets and stone-quarries. And so at the Village of Clamars, one bleared May morning, there enters a Figure, ragged, rough-bearded, hunger-stricken; asks breakfast in the tavern there. Suspect, by the look of him! "Servant out of place, sayest thou?" Committee-President of Forty-Sous finds a Latin Horace on him: "Art thou not one of those Ci-devants that were wont to keep servants? Suspect!" He is haled forthwith, breakfast unfinished, towards Bourg-la-Reine, on foot: he faints with exhaustion; is set on a peasant's horse; is flung into his damp prison-cell: on the morrow, recollecting him, you enter; Condorcet lies dead on the floor. They die fast, and disappear: the Notabilities of France disappear, one after one, like lights in a Theatre, which you are snuffing out.
In elementary school having just learned exponentiation several friends of mine and I challenged each other to see who could find 2^64 the fastest. Two among the group independently discovered the classic exponentiation-by-squaring method, even though none of us could prove it right. Cole definitely would not have multiplied by two repeatedly.
For some reason, the first method that came to mind for me was to find 2^60 as (1000 + 24)^6 (or at least I thought it would be until I realized finding powers of 24 wasn't that easy) and then double that seven times.
I would guess that Cole found 2^67 by multiple methods, because obviously it would be tragic if it turned out he was factoring the wrong number.
The existence of mechanical computers as given in another answer.
The use of language in the linked paper (1). A mix of formal and informal. After all there is nothing to prove,
just an explanation of how he did it. Examples:
> The conclusion was inevitable
> It is well known that every factor of 2^n — 1, p being a prime,
is of the form…
They won't. The algorithm he used was basically just a quadratic sieve which can be quickly run on a computer (i.e. seconds). Simple algorithms (polard rho eg) can factor numbers of this size in fractions of a second. Humans are hopelessly outclassed here.
• The account by E. T. Bell, of Cole standing up and wordlessly carrying out a multiplication on the blackboard, ending with applause, is, in my opinion after considering all the facts available, almost certainly a "dramatization" by Bell.[1] However, I'm inclined to believe that Cole basically followed the structure of his paper[2], which does have a dramatic end and there would have been applause (even if his talk was not wordless). Also, the "three years of Sundays" seems about right.
• Unlike the implication of Bell's story, M_67 = 2^67 - 1 was already known to be composite by 1903, as Lucas[3] had already shown it (and Cole cites him in his paper). What was not known was a factorization, which is what Cole did.
• Cole's paper is very elliptically written and only focuses on his original contribution, but if you read it carefully for the factorization method, the true hero of this story IMO is Paul Peter Heinrich Seelhoff (1829–1896). This Seelhoff (does not have an article on Wikipedia, even on the German one) was a high-school teacher in Mannheim, and he wrote some papers that have many great ideas but are also full of computation errors. No one seems to have paid his work much attention before Cole (his name shows up in only very few histories of mathematics / number theory from that period).
• Seelhoff's method, which Cole used, is basically what we now call the Quadratic Sieve[4], “still the fastest for integers under 100 decimal digits or so”. It's amazing how many modern algorithmic ideas Seelhoff anticipated, even things like optimization tweaks and when to stop. (After Cole only Kraitchik seems to have read Seelhoff closely, and it's via Kraitchik that the modern methods like CFRAC and quadratic sieve came to be, after the advent of computers.)
• The idea at heart is a generalization of Fermat's factorization method, and goes back to Lagrange (and Gauss): to factor N, if you can find a solution to x^2 ≡ y^2 (mod N), where x ≢ ±y (mod N), then you're done, because N divides (x-y)(x+y) and taking gcd(N, x-y) and/or gcd(N, x+y) will give a nontrivial factor.
• If you have some magical method that tends to give small quadratic residues mod N, that will help, in two ways: the very fact that (say) 7 is a quadratic residue mod N (and therefore mod its residues) will impose some condition (by quadratic reciprocity) on the prime factors of N, which will knock out half the possible values. So with k quadratic residues you need to try only 1/2^k of the primes. The second way it helps (due to Seelhoff) is that your method may end up giving you two ways the same quadratic residue can be achieved, i.e. a nontrivial solution to x^2 ≡ y^2 (mod N).
• The idea of the quadratic sieve (and Seelhoff's innovation) is to find such residues by doing the following: first, try to find "smooth" residues mod N, by looking for conditions on x to make a=x^2-N be divisible by several chosen small primes. Then, whichever of these these residues you can fully factorize, write it as (basically) a bit vector of the primes that divide it. Keep track of these bit vectors (write them down as rows one below the other, say) and do row reduction: the moment you get a "cycle" (nontrivial linear relation), you're likely done, as you have x^2 = y^2 (mod N).
• Cole did come up with an idea or two of his own, which cut the search space down (in the third phase) by a factor of 4 (instead of 2) for each residue: thus, estimating that it was doable, he had the persistence to carry out this project of hand calculation. But the bulk of this calculation was finding quadratic residues and using the resulting congruence conditions, not trial division.
I'll stop here as I have the sense this comment has become super long, but you can get a good idea of all this in Factoring Integers Before Computers (1994) by H. C. Williams and J. O. Shallit (cited in the page below).
———
[0]: https://shreevatsa.github.io/site/cole.html [see my comments on https://mathoverflow.net/a/207354 — 2016, 2019, 2021 :P])
[1]: I reach this conclusion from a careful reading of E. T. Bell's biography by Constance Reid, which BTW is an amazing book. The first part of it reads like a detective story (her own work unearthing details about Bell). Also, at one point there's a cameo appearance by Arthur C. Clarke, as a fellow science-fiction writer (E. T. Bell wrote some SF novels as John Taine) who happened to live in Sri Lanka. BTW Bell was almost certainly not present at that New York meeting of the AMS where Cole read his paper. Also, Cole was likely Bell's PhD advisor later (surprising we can't be sure of advisor-advisee relationships this late).
[2]: http://projecteuclid.org/euclid.bams/1183417760 -> https://projecteuclid.org/journalArticle/Download?urlId=bams...
[3]: Lucas is another hero; you can read H. C. Williams's book about him. He wrote papers on agriculture etc and randomly turned to math and did an amazing amount of work (on factoring etc — and it was he who made Fibonacci numbers a worthy object of study and gave them that name) in just a few short years, before abruptly abandoning math completely after someone gave a bad review to one of his papers.
[4]: https://en.wikipedia.org/wiki/Quadratic_sieve