1.0 Possible Sources for the Cantata Texts
It would be a lie if I said I carefully typed all my original German texts in from Bach's manuscripts! A big one, too!!
Instead, I turned to the Internet for a source of German cantata texts... which is a problem, because whilst there are quite a few around, none of them are especially definitive or authoritative, and if they are apparently definitive, they usually make it difficult to acquire the texts, or do things to them that make re-using them problematic! I assess some of the possible sources in what follows.
For example, the "most authoritative" source I could think of would be Bärenreiter, whose complete Urtext edition of the cantata full scores is the modern definitive source for the music (and which I purchased many years ago at eye-watering cost!). Their texts are available online, but not on an official Bärenreiter website; the texts are all in PDF format (which makes extracting usable text from them practically impossible); and for any given cantata you click on, the resulting PDF will apparently contain the text for several other cantatas (seemingly at random!). Turning any of that lot into something usable is more than I was able to manage, anyway.
A non-definitive but quite good source is, fortunately, available from this Canadian university website, put together by Walter F. Bischof. For the most part, the texts seem pretty accurate, but there are a few key drawbacks. Firstly, it seems that Mr. Bischof was allergic to the Eszett character (ß). Where the text ought to say "Daß" or "muß", for example, he will always have "Dass" or "muss". This is understandable in one way: the Eszett really is a form of double-s in German... and whilst it's been common practice to substitute "ss" for "ß" in modern German, it seems a bit wrong to me to do it in old German texts. Of course, one could easily do a global search-and-replace of 'ss' with 'ß' for any cantata text obtained, but that would be wrong, since ß only replaces double-s in certain positions (a word such as "Wasser" keeps its 'ss' in old and new spelling alike), so a 'simple' search-and-replace would produce wrong results.
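The search-and-replace pitfall described above can be shown in a few lines of Python. This is only a sketch: the function names and the two-word whitelist are made up for illustration, and a real word list would need to be far longer and checked against a printed edition.

```python
import re

def naive_eszett(text: str) -> str:
    """Blindly turn every 'ss' into 'ß', whether that is right or not."""
    return text.replace("ss", "ß")

# Hypothetical, deliberately tiny whitelist of modern -> older spellings.
OLD_SPELLINGS = {"dass": "daß", "muss": "muß"}

def whitelist_eszett(text: str) -> str:
    """Only change whole words that appear in the whitelist."""
    def fix(match: re.Match) -> str:
        word = match.group(0)
        old = OLD_SPELLINGS.get(word.lower())
        if old is None:
            return word  # not in the list: leave untouched
        # Keep a leading capital if the original word had one.
        return old.capitalize() if word[0].isupper() else old
    return re.sub(r"\w+", fix, text)

sample = "Dass er Wasser trinken muss"
print(naive_eszett(sample))      # damages 'Wasser'
print(whitelist_eszett(sample))  # leaves 'Wasser' alone
```

The whitelist approach is safer but only as good as the list behind it, which is why restoring the ß by hand, against a trusted printed text, may still be the more reliable route.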
The other major problem the Bischof texts have is that the web technology used to present them is a bit archaic: everything is in the form of a table, with the first column containing the movement number, instrumentation and voice, and the second column containing the actual cantata text. Additionally, each cantata page has header and footer text, which we're also not interested in. A plain download or copy-and-paste therefore cannot give you only the German cantata text, without all the header, footer and instrumentation details. You can certainly acquire a reasonably good German text from that website, but it will be missing Eszetts, and will be larded with extraneous information you don't want.
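If the pages really are laid out as a simple two-column table, the second column could in principle be pulled out with Python's standard-library HTML parser. The sketch below assumes one movement per table row, exactly two td cells per row, and header and footer text outside the table. The sample HTML is invented for illustration and is not copied from the real site, whose markup may well differ.

```python
from html.parser import HTMLParser

class SecondColumnText(HTMLParser):
    """Collect the text of the second <td> in every table row."""

    def __init__(self):
        super().__init__()
        self.cell_index = -1   # position of the current <td> within its row
        self.in_cell = False
        self.buffer = []
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.cell_index = -1
        elif tag == "td":
            self.cell_index += 1
            self.in_cell = True
            self.buffer = []
        elif tag == "br" and self.in_cell:
            self.buffer.append("\n")  # keep the verse line breaks

    def handle_endtag(self, tag):
        if tag == "td" and self.in_cell:
            self.in_cell = False
            text = "".join(self.buffer).strip()
            if self.cell_index == 1 and text:
                self.texts.append(text)

    def handle_data(self, data):
        if self.in_cell:
            self.buffer.append(data)

def extract_second_column(html: str) -> list[str]:
    parser = SecondColumnText()
    parser.feed(html)
    parser.close()
    return parser.texts

# Invented sample that mimics the assumed layout.
SAMPLE = (
    "<html><body><h1>Some page header</h1><table>"
    "<tr><td>2. Recitativo T</td>"
    "<td>Der Heiland ist gekommen,<br>Hat unser armes Fleisch und Blut</td></tr>"
    "</table><p>Some page footer</p></body></html>"
)

for movement in extract_second_column(SAMPLE):
    print(movement)
```

Nested tables or rows with extra cells would defeat this simple counter, so the output would need checking against the page itself.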
On the plus side, if you want the text for BWV 115, you can be reasonably confident that the URL involved would be something like webdocs.cs.ualberta.ca/~wfb/cantatas/115.html: a predictable URL makes scripting the retrieval of a particular text very easy to do.
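As a minimal sketch of what that scripting might look like in Python: the URL pattern is the one quoted above, but the https scheme, the function names and the assumption that the site is still served at that address are all mine.

```python
from urllib.request import urlopen

# Pattern taken from the example URL above; the scheme is an assumption.
BASE = "https://webdocs.cs.ualberta.ca/~wfb/cantatas"

def bischof_url(bwv: int) -> str:
    """Build the (presumed) page address for a given BWV number."""
    return f"{BASE}/{bwv}.html"

def fetch_bischof_page(bwv: int, timeout: float = 30.0) -> str:
    """Download the raw HTML for one cantata (needs network access)."""
    with urlopen(bischof_url(bwv), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")

print(bischof_url(115))
```

Because the address depends only on the BWV number, a loop over a list of numbers would be enough to fetch a whole batch of pages.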
1.3 Bach Digital
An excellent digital resource for the texts is available from the Bach Digital project, too. I do not know their degree of authoritativeness, since they don't appear to declare specific sources for their cantata texts, but the project is associated with various German universities and the Leipzig Bach Archive, so they seem pretty solid. Their cantata texts are nicely presented, with proper use of diacritic marks (so no trouble with umlauts and Eszetts, for example!), and without page headings and footers getting in the way of text selection. One of the longer cantatas, BWV 30, for example, can be accessed from this page and the entire set of 12 movements can be selected, copied and pasted into a text editor without any header/footer material cluttering things up. They don't, in other words, break up long cantata texts into separate 'pages' which each have to be selected and copied individually.
The Bach Digital texts would, therefore, make an excellent source for my German cantata texts, except for the fact that they are not readily downloadable in a usable text format. It's all very well being able to use a mouse to select, copy and paste text from a web page, but it would be a lot more convenient to be able to download the text from a website using a tool such as wget or curl. That's to say: you can, of course, issue a command such as:
wget "https://www.bach-digital.de/receive/BachDigitalWork_work_00000038?XSL.Style=detail" -O 30.txt
...and that will download a cantata text. But the first problem is knowing which cantata text it will be! The URL above seems to mention '00000038', so is that the link to get the text for BWV 38? No, as it happens: it will fetch you the text for BWV 30! Well, does this mean that you should always add 8 to the BWV number to get to the right URL? Maybe if I wanted the text for BWV 153, I should specify the URL '00000161' (i.e., 153+8)? Well, no, unfortunately not. The text for BWV 153 actually uses a URL that contains the number '00000187'! Clearly, they have their own way of organising things at their back end, and it's not for me to pass judgement on it from a position of ignorance, but if you cannot easily predict the URL you should use to get a particular cantata text, then you cannot easily script the acquisition of cantata texts. You're back to the slow process of visiting a URL in a browser and clicking and dragging your mouse to select the text you want.
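One way around an unpredictable numbering scheme is a hand-built lookup table. The Python sketch below contains only the two pairings mentioned above (BWV 30 and BWV 153); the table name and function name are hypothetical, and every other entry would have to be found by visiting the site and noting the number in the address bar.

```python
# BWV number -> Bach Digital work number, from the two cases noted above.
WORK_IDS = {
    30: 38,
    153: 187,
}

def bach_digital_url(bwv: int) -> str:
    """Build the detail-page URL for a BWV number that is in the table."""
    work_id = WORK_IDS[bwv]  # raises KeyError for anything not yet looked up
    return (
        "https://www.bach-digital.de/receive/"
        f"BachDigitalWork_work_{work_id:08d}?XSL.Style=detail"
    )

# The gap between the two numbers is not constant, so no simple
# 'add N to the BWV number' rule can replace the table.
offsets = {bwv: work_id - bwv for bwv, work_id in WORK_IDS.items()}

print(bach_digital_url(30))
print(offsets)
```

Building the table is still manual work, of course: it moves the effort from copying texts to collecting numbers, which only pays off if the texts need to be fetched more than once.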
The other problem with the Bach Digital downloads is that there appears to be no way to download them in a nice text-only format. The wget command I showed above will get the thing down in text format, but it will have hundreds of HTML codes embedded within it, and clearing them out is non-trivial (though it can be scripted relatively easily).
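For what it's worth, a tag-stripping script might look something like the Python sketch below. It uses only the standard library, and it is generic: it knows nothing about how Bach Digital actually structures its pages, so it would return the navigation and other page furniture along with the cantata text. The sample HTML is invented for illustration.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Keep the text between tags and drop the tags themselves."""

    SKIPPED = {"script", "style"}      # contents are never wanted
    LINE_BREAKING = {"br", "p", "div"}  # tags that imply a new line

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif tag in self.LINE_BREAKING:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)

def strip_tags(html: str) -> str:
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    lines = (line.strip() for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)  # drop blank lines

# Invented sample, not taken from the real site.
SAMPLE = (
    '<div class="text"><script>var a = 1;</script>'
    "<p>Der Heiland ist gekommen,<br/>Hat unser armes Fleisch</p></div>"
)
print(strip_tags(SAMPLE))
```

Picking out just the cantata text would need an extra step that recognises whichever element holds it, and that depends on markup I have not examined.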
In the end, I decided to source my texts from Bach Digital by manual cut-and-paste, which is slow and non-techy but works. I did, however, cast my eyes over the texts contained within Alfred Dürr's The Cantatas of J. S. Bach: where there was a discrepancy, Dürr won, and the texts were modified to match his. For example, for BWV 61, Bach Digital has this for the start of Movement 2:
Der Heiland ist gekommen,
Hat unser armes Fleisch
Und Blut an sich genommen
Und nimmet uns zu Blutsverwandten an.
But Dürr breaks his lines like this:
Der Heiland ist gekommen,
Hat unser armes Fleisch und Blut
An sich genommen
Und nimmet uns zu Blutsverwandten an.
The different line content and the capitalisation of 'An' at the start of the third line make Dürr's text different from Bach Digital's (though it, incidentally, agrees with Bischof's). In my German text, you'll find I break and capitalise the lines as Dürr has them, not as Bach Digital do.
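Since Dürr exists only on paper, a comparison like this has to be done by eye. Where two digital transcriptions are to hand, though (Bach Digital's and Bischof's, say), Python's difflib module could flag the lines worth checking against the book. The sketch below uses the two versions of the BWV 61 lines quoted above; the variable and function names are mine.

```python
import difflib

VERSION_A = """Der Heiland ist gekommen,
Hat unser armes Fleisch
Und Blut an sich genommen
Und nimmet uns zu Blutsverwandten an."""

VERSION_B = """Der Heiland ist gekommen,
Hat unser armes Fleisch und Blut
An sich genommen
Und nimmet uns zu Blutsverwandten an."""

def differing_lines(a: str, b: str) -> list[str]:
    """Return only the lines that differ, marked '- ' (in a) or '+ ' (in b)."""
    diff = difflib.ndiff(a.splitlines(), b.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]

for line in differing_lines(VERSION_A, VERSION_B):
    print(line)
```

Lines that are identical in both versions are left out, so only the disputed lines need to be looked up.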
Bach Digital do seem to have a rather 'personal' approach to punctuation and line-breaks, unfortunately: quite a lot of them do not agree with Dürr. Nevertheless, BD has the distinct advantage of being in digital form, whilst Dürr remains strictly paper-based!
The use of Dürr as the 'tie-breaker' is perhaps not particularly justified: I've seen what I believe to be several typographical mishaps in his German text (2006 edition), but he is generally regarded as an authority on the subject of Bach's cantatas, so I defer to his reading of the text and just have to keep my fingers crossed that Oxford University Press knew what they were doing when they typeset it!
Very occasionally, I've had to add extra line breaks that are not in the Dürr 'master' version, simply to make sure my text will fit on the page correctly. You can usually tell when this has happened if the line starts with a lower-case letter, rather than the more usual upper-case one.