$\begingroup$

I would like to download all my answers on MSE and save them as a LaTeX script or a PDF file.

This answer 1 provides a method to download all the answers as a CSV file (not a LaTeX script or PDF file).

This answer 2 seems helpful. However, the tool provided in the answer does not work anymore.

I wonder whether

  • there exists a tool that takes the CSV file (obtained by the method in answer 1) as input and creates a LaTeX script,
  • or there exists a more direct method that gives me the LaTeX script of all my answers?
$\endgroup$

1 Answer

$\begingroup$

Not an answer but I wanted to post some thoughts that a solution would need to consider, if someone wants to have a go...

  1. I downloaded all my posts using answer 1 and got a 3 MB .csv file. I've never opened a 3 MB TeX file before; does that work? If not, the posts may need to be sorted and split into batches.
  2. MathJax is not $\LaTeX$: There are certain constructs that are allowed in MathJax but not in $\LaTeX$. For example $$ \begin{align} ... \end{align}$$. Another is $\sum_{\text{newlines} \\ \text{in subscripts}}$ (this requires \substack in LaTeX) and similarly $\text{new lines} \\ \text{in math mode}$, $$\text{even}\\\text{in display mode.}$$ Those throw errors in $\LaTeX$. Another one is that \color works differently here: ${\color{red}red}$ works in $\LaTeX$. But $\color{red}{red}$ works in both so I think this isn't an issue for the MathJax $\to \LaTeX$ direction.
  3. MathJax on Math.SE doesn't have an explicit list of packages; one will need to check what packages are needed in the $\LaTeX$ preamble.
  4. \newcommand, \renewcommand and \let across different questions/answers may conflict in the resulting .tex so you will need to e.g. add some \let\...\relax at the end of posts.
  5. Posts on Math.SE are not written with only MathJax: Links and images are written in Markdown. In particular images are not included in the csv file download. Markdown Tables might be a bother to format correctly.
  6. Actually, looking at the .csv file, it seems to not have exactly the source I wrote, but some HTML. An example from my .csv file below. So that will also need to be parsed out.
<p><span class=""math-container"">$$C_1(1+||x||^2)^k \leq \sum_{|\alpha| \leq k} x^{2\alpha} \leq C_2 (1+||x||^2)^k,$$</span></p>
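
As a minimal sketch of the parsing step in point 6, something like the following could read the CSV and strip the HTML wrapper while leaving the math source untouched. It uses only the Python standard library. The column names `Id` and `Body` are assumptions about what the export contains, and `PostTextExtractor`, `html_to_text` and `read_posts` are hypothetical helper names. Links and images are simply dropped here, so point 5 is not handled.

```python
import csv
import io
from html.parser import HTMLParser


class PostTextExtractor(HTMLParser):
    """Collect the text of a post body, dropping tags but keeping math source."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &lt; back into "<".
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_endtag(self, tag):
        # The end of a paragraph becomes a blank line, as in a .tex source.
        if tag == "p":
            self.parts.append("\n\n")

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(body):
    parser = PostTextExtractor()
    parser.feed(body)
    parser.close()
    return "".join(parser.parts).strip()


def read_posts(csv_file, id_column="Id", body_column="Body"):
    # The column names are assumptions; adjust them to the actual CSV header.
    for row in csv.DictReader(csv_file):
        yield row[id_column], html_to_text(row[body_column])


# Tiny in-memory example; the csv module undoes the doubled quotes ("").
SAMPLE = 'Id,Body\r\n1,"<p><span class=""math-container"">$a &lt; b$</span> holds.</p>"\r\n'
for post_id, text in read_posts(io.StringIO(SAMPLE)):
    print(post_id, text)
```

For a real export one would pass `open(path, newline="", encoding="utf-8")` instead of the `io.StringIO` object.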

In addition, the comments on some questions are IMO vital to understanding the post, so including them would be a nice-to-have.

I think the easiest would probably be to make a script that visits each of the questions and just prints the page to PDF. (Don't do too many pages at once if you don't want to be banned from SE.) This will give you PDF files but no .tex file.
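
A rough sketch of such a script, assuming a Chromium-based browser is installed and reachable as `chromium` (the executable name, the output file names and the 10-second delay are all assumptions, not recommendations from SE). It relies on the browser's `--headless` and `--print-to-pdf` command-line options; whether MathJax has finished rendering by the time the page is printed is something to check on a few pages first.

```python
import subprocess
import time


def build_print_command(url, pdf_path, browser="chromium"):
    # Headless browser invocation that writes the page at `url` to `pdf_path`.
    return [browser, "--headless", f"--print-to-pdf={pdf_path}", url]


def print_pages(urls, delay_seconds=10.0, browser="chromium", run=subprocess.run):
    """Print each URL to post_0000.pdf, post_0001.pdf, ... with a pause in between."""
    for index, url in enumerate(urls):
        pdf_path = f"post_{index:04d}.pdf"
        run(build_print_command(url, pdf_path, browser), check=True)
        # Pause between requests to avoid hammering the site.
        time.sleep(delay_seconds)
```

Calling `print_pages(["https://math.stackexchange.com/a/<id>", ...])` with real answer ids in place of `<id>` would then produce one PDF per page. The `run` parameter exists only so the function can be tried out without launching a browser.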

If you need a .tex file you will probably want to go from the csv file; TexSoup looks like it might be useful.
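
To give an idea of what going from the csv file could involve, here is a small sketch that uses plain regular expressions rather than TexSoup. It fixes only one of the MathJax constructs mentioned above (an `align` environment wrapped in `$$`) and wraps each post in `\begingroup ... \endgroup`, so that a `\newcommand` in one post stays local and does not clash with the next. That grouping is a swapped-in alternative to adding `\let\...\relax` by hand. The package list in the preamble is a guess, the function names are hypothetical, and characters such as `%` or `&` in the surrounding prose are not escaped.

```python
import re

# $$ \begin{align} ... \end{align} $$ is accepted by MathJax but not by LaTeX.
ALIGN_IN_DISPLAY = re.compile(
    r"\$\$\s*(\\begin\{(align\*?)\}.*?\\end\{\2\})\s*\$\$", re.DOTALL
)

# Assumed preamble; the packages actually needed depend on the posts.
PREAMBLE = (
    "\\documentclass{article}\n"
    "\\usepackage{amsmath,amssymb,xcolor}\n"
    "\\begin{document}\n"
)


def fix_align(text):
    # Drop the surrounding $$ and keep the environment itself.
    return ALIGN_IN_DISPLAY.sub(lambda match: match.group(1), text)


def build_document(posts):
    """posts is an iterable of (post_id, text) pairs; returns one .tex source."""
    chunks = [PREAMBLE]
    for post_id, text in posts:
        # The group keeps \newcommand definitions local to this post.
        chunks.append(
            f"\\section*{{Post {post_id}}}\n"
            f"\\begingroup\n{fix_align(text)}\n\\endgroup\n"
        )
    chunks.append("\\end{document}\n")
    return "\n".join(chunks)
```

The other constructs (line breaks in subscripts, `\\` inside `\text`, Markdown links, images and tables) would each need their own handling.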

$\endgroup$
  • $\begingroup$ Thank you very much for your answer. I think the method "creating a LaTeX script or PDF file from the CSV data" seems complicated. A direct method as you suggested, "make a script that visits each of the questions and just prints the page to PDF", seems doable. Besides, I'm pretty sure that several months ago I read a post (not sure whether it was on meta.math stackexchange) where the author said he successfully printed all his answers from MSE to a PDF file of hundred pages (or a tex file) and, if my memory is good, his method didn't use the CSV data. I tried to search for this post without success. $\endgroup$
    – NN2
    Commented Jul 7 at 11:00
  • $\begingroup$ @NN2 I believe I saw a similar/different post of someone selling their Math.SE posts as a book on Amazon. Obviously that means s/he downloaded his/her answers and put it in a printable format... $\endgroup$ Commented Jul 7 at 11:27
  • $\begingroup$ @NN2 this also doesn't answer the question since it creates Markdown files, but it's definitely a good start: math.meta.stackexchange.com/a/36221/80734. This is a Python script that downloads all the questions and their answers. It's not perfect but much better than nothing. I just tried it and it works. $\endgroup$ Commented Jul 7 at 12:25
  • $\begingroup$ Thank you for the suggested link. I'll try it. $\endgroup$
    – NN2
    Commented Jul 7 at 21:21
  • $\begingroup$ One approach to painlessly circumventing the "MathJax is not LaTeX" difficulty is to use StackEdit to review the downloaded file. An alternative (more painful) approach is to attack the issue programmatically, identifying any potentially needed editing, and writing a program in some language like C or Java to handle the grunt work. If you go the painful route, then I advise against piggybacking off of someone else's computer code. In the long run, you will have less pain if you micromanage the situation. ...see next comment $\endgroup$ Commented Jul 9 at 4:18
  • $\begingroup$ There are many obscure gotchas that will surface, one at a time. If you are micromanaging, then it becomes easy to squash each gotcha as it surfaces. $\endgroup$ Commented Jul 9 at 4:22
