Back in around 2012, I got “into” the Bach cantatas in a big way and soon decided that I’d like to have translations of the original German close to hand for each of them. The good news is that translations are available from many places these days (in addition to whatever is included with your CD booklet, of course). The bad news is that, almost without exception, they didn’t seem very good to me!
2.0 The Problem
Now: let’s be honest. The original German dates from the early 18th century for the most part, so it’s going to be fairly formal and archaic at the best of times. In addition, it’s full of Lutheran pietism and deep theology, so it’s never going to be ‘light reading’ either! In short, it’s to be expected that English translations of the cantata texts are going to be awkward to read, convoluted syntactically and not terribly convincing, poetically.
Nevertheless, I do think a useful translation should be at least vaguely comprehensible. To take just one example: Dürr has translations for all the cantatas in his magnum opus, The Cantatas of J.S. Bach (with the translations in the English edition of his book actually being provided by one Richard Jones). For BWV 6, he has this snippet:
Highly praised Son of God,
let it not be unwelcome to you
That now before your throne
We lay down a prayer:
Don’t you find the endless sub-clauses infuriating?! You don’t actually know what the point of this paragraph is until the fourth line! Of course, the translation is trying to match the German, line for line; and German word-ordering inevitably lends itself to drawn-out English phrases and nested sub-clauses. So I don’t think this is a bad translation - just that it’s not terribly poetic, nor very comprehensible, nor really explicatory of the original German. By which I mean: what does “laying down a prayer” mean anyway? Is it the same thing as “offering a prayer”? Even as a metaphor, it’s just not a phrasing that means much, even in English!
Now, if we were going for a free-form re-write into vaguely modern English, I think we might instead do better with:
May it not displease you,
High-praised Son of God,
if we pray to you now
Before your throne:
I think that is more comprehensible and less convoluted. But this won’t really do either, because the more natural English now bears no relationship, line by line, to the original German:
Hochgelobter Gottessohn,
Laß es dir nicht sein entgegen,
Daß wir itzt vor deinem Thron
Eine Bitte niederlegen:
Notice how my ‘modern English’ first line now bears no relation to the German first line: even a non-German speaker can surely see that ‘Hochgelobter Gottessohn’ means ‘High-something God-Son’… which isn’t what ‘May it not displease you’ sounds like at all! Similarly, the German line 3 says something about ‘deinem Thron’ which even a non-German speaker can see says something about ‘your Throne’… and my ‘modern’ translation only gets round to mentioning a throne in its line 4.
Modern English is fine for comprehension, in other words, but useless for ‘synoptic reading’, where the English words kind of match up with the German ones.
Clearly, the Dürr translation matches the line order much better than my ‘modern’ re-working, but (I would argue) my modern-English translation conveys its meaning more intelligibly and says what it says in a more natural way whilst it’s at it. So comes the obvious next question: is there a middle ground - something that achieves a better match between the German lines and the English, yet is still more comprehensible than the English used in Dürr’s book? Well, I’d hope so, and this is why, in my published translations, I’ve finally gone for this instead:
High-praised Son of God,
let it not displease you
if we now, before your throne,
Here, the English in each line matches the German in each line of the original: ‘Son of God’ matches with ‘Gottessohn’ on line 1 in each case, and ‘throne’ matches with ‘Thron’ on line 3 in each case; but (I would claim) the new translation is less stilted than the Dürr original even though line-matching has been retained.
For example, the problem of what “laying down a prayer” means has been resolved, by translating it instead as “petitioning you”. The original German is “eine Bitte”, which literally means “a request” or “a petition”, together with the word “niederlegen”, which literally means “lay down” or “set down”. So Dürr’s “we lay down a prayer” is literal in one way (niederlegen = lay down), but is a bit free in the way it handles the word ‘Bitte’ (which is a plea, petition or request, not literally a prayer). Hence, it seems appropriate to me to simplify a little and say the original German is about ‘laying down a request’ -> ‘setting out a request’ -> ‘making a plea’, which eventually in English gets us to the verb ‘to petition’. My translation thus loses the (slightly odd!) metaphor of ‘laying down’, but retains the sense of ‘requesting’ or ‘pleading’.
I could also have said “let it not displease you if we now … make this request of you…”, but I think that loses the sense of the formality of the original German, and its ecclesiastical setting. These cantatas were written for liturgical performance, so the language needs to be appropriately formal and elevated, not casual and street-wise. In English, we tend not to say ‘we request something of God’, but we do talk about being ‘humble petitioners’. Hence my choice of ‘if we petition you’ rather than ‘if we request of you’. The general sense is pretty much the same in either case, but the formality and ‘churchiness factor’ is slightly higher with the verb ‘petition’ than it is with ‘request’ (or even ‘ask’). “If we plead with you…” would have worked equally well on all these fronts, I think; but ‘pleading’ sounds, in English, a lot more pitiable and desperate than ‘petitioning’, and that extra baggage wasn’t, it seemed to me, called for in this passage.
You get my point, I hope. It’s not that other translations are bad - that the people who made them don’t actually speak German or English very well, for example. Most of them clearly speak both excellently (though there are some hilarious exceptions out there!) and there’s little to fault in their literal translations. No, my problem with all the translations I saw was that they were too literal, or paid no heed to the comprehensibility of the resulting English, or had little regard for the formality or ecclesiastical nature of the language the cantatas need to be rendered into.
Despairing of finding translations that I felt struck a reasonable balance between the comprehensible, the poetic and the formal, I decided to produce my own - and these are the translations you’ll find on this website. I don’t claim they’re perfect, but I do think they are more naturally readable than most, whilst preserving the religiously-elevated sense of the original German, and mostly preserving the line-by-line comparability (i.e., a line of my translation should more or less translate the German line exactly opposite it). It is not possible to make fine poetry out of the resulting English (I’m no poet anyway, but I think even Shakespeare would have trouble turning this set of 18th Century Lutheran pietist texts into beautiful poetry!), but it should be at least readable and meaningful.
3.0 The Bible
I want to mention a specific ‘exception’ I made during the translation process: Biblical quotations.
If I ever found Bach’s librettists quoting chunks of the Lutheran Bible, I decided I would quote the equivalent passage from the King James version (KJV) of the English Bible, even if it meant that the English didn’t quite accurately translate the German (which happens because Luther’s Bible isn’t identical in content to the King James, regardless of language differences). I used the KJV because its formal and antique English automatically gives a heightened sense of the spiritual and godly, thus matching the effect Bach’s listeners would have felt on hearing quotations from Luther’s Bible in the first place. Quoting the Bible also makes the text instantly familiar to most English readers: there’s an ‘ancestral memory’ about this stuff which means you recognise it as soon as you hear the first few words - just as Bach’s listeners would have recognised the passages in question as coming from their Bible.
One example of the consequences of this approach will have to suffice: from Cantata 11, the Ascension Oratorio. In the original German, we have:
Sie aber beteten ihn an,
wandten um gen Jerusalem von dem Berge,
der da heißet der Ölberg,
welcher ist nahe bei Jerusalem
und liegt einen Sabbater-Weg davon
In the third line, we have the words der Ölberg. In German, that is simply the name of ‘The Mount of Olives’, a name which is familiar to most current English speakers, I think. But my translation of this passage reads:
But they prayed to him,
and journeyed unto Jerusalem
from the mount called Olivet,
which is from Jerusalem
a sabbath day’s journey,
And here I’ve mentioned ‘the mount called Olivet’. Why? Because the passage in question is a part-quotation from Acts 1, verse 12, which in the King James translation reads:
Then returned they unto Jerusalem from the mount called Olivet, which is from Jerusalem a sabbath day’s journey.
In other words, the original German is actually a mash-up of bits of the Gospel of St. Luke and this specific passage from Acts, and therefore it isn’t strictly a single quotation from the Bible at all! But because it presents different bits of the Bible as one passage, I nevertheless stitched together the appropriate bits from the King James Bible - and that means referring to ‘Olivet’, not ‘the Mount of Olives’. It also means the word-order used is a tad more archaic than you’d find elsewhere in my translations, constrained as I am at this point by the KJV’s antique approach to English word-order: thus, ‘which is from Jerusalem a sabbath day’s journey’ rather than the more modern ‘which is a sabbath day’s journey from Jerusalem’. It also explains the presence of “unto Jerusalem” when journeying “to” Jerusalem would have been the more modern way of saying the same thing.
4.0 Some Practical Issues
I have not generally preserved indentations in any of my texts, except when Bach’s librettists have two characters exchanging words (say, the Soul talking with Jesus). In those cases, indentations indicate one voice has stopped talking and the other has begun (or vice versa!). Otherwise, the texts are rendered as left-flush paragraphs, without differentiation.
I have generally ditched the common practice of capitalising ‘You’ when referring to persons of the Holy Trinity. It remains true that God, Jesus and the Holy Ghost get capitalised as proper names, of course; but in phrases such as ‘I pray to you’ or ‘To me he comes’, I do not generally render it ‘I pray to You’ or ‘To me He comes’. I found the proliferation of capital letters that would result from observing historical practice too distracting (especially given that the texts are usually poetical and thus tend to want to start each line with a capital letter at the best of times). I figure that God will understand the aesthetics involved and will forgive the apparent over-familiarity!
I have felt free to terminate sentences (or to end clauses with colons or semi-colons) as seems appropriate. The original poetry often piles up thought-on-thought, line by line, with commas as delimiters. In English, this often results in huge, meandering run-on sentences, where the sub-clauses cannot readily be delimited, so I’ve terminated them when it seemed sensible to do so, resuming the thoughts with entirely new sentences as necessary.
When in doubt, I have resorted to high-flown language, or Biblical language, or Sunday School language. Thus ‘peace’ might be rendered as ‘solace’; or ‘for ever’ might become ‘eternally’ -though the specifics would depend on context: ‘forever and ever, Amen’ as the closing words of an aria would sound entirely correct to a current Church-goer, so would probably be retained. In any event, the aim is always to elevate the language to the sort of style that Bach’s original listeners might have thought acceptable for a Sunday morning, whilst somehow retaining a sense of modern language that could be understood comfortably.
I’m British (and Australian), so my choice of language may at times reflect that heritage. In particular, I draw a distinction between implying something and inferring it; I couldn’t care less; and I have a tendency to write that “I shall do something” when, as I understand it, an American might expect to write “I will do something”. Apparently, “shall” is considered archaic in USA-English, which is fair enough I suppose. However, the difference between ‘will’ and ‘shall’ is significant in two respects. First, there is an implied use of a subjunctive sense when one reads ‘shall’: a mood or sense of compulsion or impulsion, not just a sense of future tense. Such a sense seems to me appropriate to a religious text where God in one form or another is impelling the listener to do something. Second, the very archaic nature of the word that might put off an American has, I think, the effect of ‘heightening’ and ‘solemnising’ the language. I’m afraid left-pondians will just have to deal with it!
For the same sorts of reasons, wherever I have had the ability to outwit my spell-checker, ‘-ize’ is usually rendered ‘-ise’ and words like colour and flavour are spelled correctly [ 🙂 ]. If ever the American spelling nevertheless appears, therefore, it’s because I was tired and hadn’t noticed the devilry attendant upon most spell-checkers these days.
You will not be able to sing my translations: I have made precisely zero effort to make them metrically-equivalent to the German originals. I am not poet enough for that!
For my sins, I am a fan of the poetry of Gerard Manley Hopkins, having been introduced to him through the heroic efforts of my very-former English teacher, Linda Hepburn, now some 40 years ago. Ever since our first introduction, I’ve had a tendency to write or speak assorted compound nouns in English at the drop of a head-hat (in similar vein to GMH’s wonders such as “silk-sack clouds” or “Selfyeast of Spirit”). It is a habit I’ve tried to wean myself from in the intervening years, but I fear that constant exposure to the German language (which has a positive love of compound noun construction) has caused something of a relapse. You may find the occasional reference to ‘heaven-haven’ at the one extreme or ‘dwelling-place’ at the other, when simpler constructions such as ‘home’ might have sufficed in either case. I plead guilty, and ask your indulgence. It is the (bad) poet in me -and the quest for ‘elevated language’ that mirrors the ‘feel’ of the German originals.
There are occasions when preserving an ‘exact translation’ doesn’t seem to me to be wise. Take Cantata BWV 79, for example. Movement 3 has this as the German:
Nun danket alle Gott
Mit Herzen, Mund und Händen,
Der große Dinge tut
An uns und allen Enden,
Der uns von Mutterleib
Und Kindesbeinen an
Unzählig viel zugut
Und noch itzund getan.
You could “translate” that in the “usual manner”. One version that is extant on the web, for example, yields:
Now all thank God
with heart,mouth and hands;
He does great things
for us and all our purposes;
He for us from our mother’s womb
and childish steps
countless great good
has done and still continues to do.
Now, there’s nothing wrong with such an approach (although I personally think the results on this occasion are rather peculiarly clumsy and incoherent), but for an Englishman of a certain vintage, the words immediately imply a well-known hymn (that used, indeed, to be my favourite when sung at morning school assemblies, aged 8!):
Now thank we all our God,
with heart and hands and voices,
Who wondrous things has done,
in whom this world rejoices;
Who from our mothers’ arms
has blessed us on our way
With countless gifts of love,
and still is ours today.
The gist of the ‘translation’ in this version is well-preserved, though there are some obvious “errors”: Herzen, Mund und Händen in line 2 would definitely be more accurately translated as “Hearts, mouth and hands” rather than “heart and hands and voices”, for example. Voices are nowhere in the original! Similarly, “childish steps” is a much more accurate rendition of “Kindesbeinen” than “from our mother’s arms”, “Beinen” being “legs”, implying steps, not arms!
Obviously, Catherine Winkworth, who produced the ‘inaccurate’ version back in the 19th Century, had poetry and metre as her watchwords, rather than strict textual accuracy. Thus, little deviations from the ‘true translation’ such as these inevitably occur.
Nevertheless, I think it would be perverse in the extreme to ignore the English words that seem in these circumstances most naturally to present themselves, no matter their minor inaccuracies. So the short version here is that wherever the original German is indicative of a piece of hymnody that is well-known in English-speaking lands, I’ve gone for the traditional hymn words, rather than a word-for-word translation. (This parallels the other ‘rule’, mentioned in Section 3 above, that if the original German is quoting a Biblical passage, I quote the King James equivalent, rather than attempt a word-for-word translation.)
This seems to me to make a lot of sense: these sorts of hymn quotations are exactly what Bach was doing, musically, when he composed his many chorales on well-known hymn-tunes of the day. If Bach is quoting hymns his congregations would have recognised, it seems entirely appropriate to do the same sort of thing if hymns an English church congregation might know are being referenced!
Another example of this translation technique is BWV 80, where Luther’s great hymn ‘Ein feste Burg ist unser Gott’ has been matched with Frederic Henry Hedge’s well-known 19th Century English translation wherever the Luther original was used by Bach’s librettist.
There is a corollary to this rule, however: I am not hugely well-versed in English hymnody, so it’s quite possible that I’ll miss an obvious hymn-quotation where one might reasonably apply. I can only plead ignorance and ask for forgiveness for my oversight in such event.
Word order can be a problem! For the most part, I try to make my English words follow the German word order, unless that causes the outcome to be perversely long-winded. Occasionally, I will reverse the word order if the sense over a short phrase is improved. For example, see BWV 92. In Movement 1, we have these opening lines:
Ich habe in Gottes Herz und Sinn
Mein Herz und Sinn ergeben[…]
Even a non-German speaker, pre-warned that “ergeben” means “surrender”, could probably work out a word-for-word translation:
I have in[to] God’s heart and mind
My heart and mind surrendered.
It’s not a bad translation; indeed, it’s very accurate. You’ll find this slightly-more-polished version elsewhere on the Internet, too:
I have surrendered to God’s heart and mind
my heart and mind.
This translator has felt empowered to move the ‘surrender’ part of the sentence more up-front, which I think helps -but I would still argue that the word order sounds (to my ears) oddly clumsy. A more natural word-order that doesn’t change the fundamental meaning is surely:
I have surrendered my heart and mind
to God’s heart and mind.
When read in parallel with the original German, the word ‘God’ certainly won’t match up in both languages, but nothing else in meaning is affected. It would be rare, I think, for me to do this to a 3-line phrase, such that a very obvious word such as ‘God’ appeared in line 1 in the German and line 3 in the English: that would be word-reordering on steroids, I suspect! I won’t say it’s never happened, but I would go to some lengths to avoid doing this sort of thing over a clause/phrase that spans more than 2 lines.
I should say that if a 2-line “clumsy” word-order nevertheless manages to sound ‘poetic’ enough, I would retain it in a heart-beat, but that didn’t seem to me to apply in this case -and there will be a few other examples about the place where inter-line word-reordering of this sort takes place.
5.0 Machine Translation… No!
I just wanted to mention that, in this wonderful technological world we have made for ourselves, you might have thought that stuffing the German into Google Translate would be sufficient. Sadly, it isn’t. Not even close!
There are too many examples to cite all of them, but there are a couple of choice ones that spring to mind as to ‘how not to translate’!
From BWV 71, the original German reads:
Er kennet deine Not und löst dein Kreuzesband.
Google helpfully translates this as:
He knows your need and loosens your cruciate ligament.
Which is very kind of him -and I daresay there are some former footballers who would have been grateful for his medical attention back in the day. But sadly, what’s actually meant is:
He knows your need, and loosens the bonds of the cross
…which is altogether less helpful for would-be athletes!
6.0 The Point
Once I had translated about 25 of the cantatas, a good friend of mine mentioned in passing, ‘knowing the English doesn’t really help you appreciate the cantatas, though, does it? If anything, quite the opposite’. His point, I think, was that the peculiar mix of militant Lutheranism, 18th Century attitudes to life and Germanic attitudes to others which you’ll find expressed in Bach’s cantata texts can be, if truth be told, rather unpleasant or over-wrought at times.
There are, for example, occasional references to “Protect us like a father from the Turk and the Pope” (BWV 18), which doesn’t fit well in the multi-cultural, multi-ethnic and religiously-tolerant world we have made for ourselves today. More often, we have an awful lot of “woe is me” poetry: “My sighs, my tears cannot be counted. If there is daily sadness and the misery does not disappear, Oh! then this pain must be marking the path to death” (BWV 13), for example. A brief half hour with this sort of stuff is not likely to improve one’s mood, I think! Therefore, I have to agree with my friend, that knowing what the words mean can detract from the loveliness of the music that Bach usually uses to accompany the words.
I therefore gave up wanting to make translations for at least 4 years.
I eventually resumed translating, however, because in the end, I decided that knowing the sense of the words could be important to understanding the musical gestures Bach often makes. For example, in BWV 7, we have repeated semiquaver figures in the continuo that are highly distinctive. But what do they mean, or what do they represent? Well, maybe you’d hear them as oar-like splashes in water regardless of the words or context; once you know the text is about Jesus coming to the River Jordan and being baptised with water by St. John the Baptist, however, the significance of the water-splash figure becomes that much more obvious. More generally, if you knew that the text says ‘I am a lowly maiden, but I give praise to the Lord on High’, the appearance of a semiquaver run from the lowest possible note to the highest becomes rather more understandable - and your ear is, I think, likely to hear it better because your eyes have read the words and primed your ears accordingly.
In other words, I probably wouldn’t want to listen to any cantata clutching the translation in hand at the same time: too often, the pretty awful words are definitely capable of distracting from the glorious music! But I think having a sense of what a cantata is about ahead of time, and being aware of the language and imagery it uses to describe things, can be a helpful enhancement to one’s appreciation of the music’s subtleties. I hope so, anyway, otherwise I’ve just wasted a few years of my life!
7.0 Some Examples
I wanted to cite a couple of examples where, it seems to me, translations need to tread carefully! I should say at the outset that in no way do I mean to trivialise or ridicule the other translations I’m about to cite: it’s a tricky game and one in which it’s more true than in many other endeavours that one stands upon the shoulders of giants. So if I point out problems in these other translations (which I inevitably hope I’ve fixed in my own), it’s only to be taken as illustrative of the issues involved, not as a joke at the expense of others.
Gott, ach Gott, verlaß die Deinen
Nimmermehr!
This has been translated variously as:
God, ah God, abandon Your own ones
God, ah God, forsake Your own people
God, ah God, forsake your people
Now the trouble with this little example is specifically ‘Nimmermehr’, which might be literally translated as “nevermore, never again” -and we can see those two literal translations being variously used in all three translations I’ve pulled from elsewhere on the ‘Net. The trouble specifically is that if you say “God, forsake us never again”, it would imply that he has forsaken us previously and the plea is for him not to do so in future… and that is, theologically, er… problematic! Protestant and Catholic theology would, I am sure, claim that God has never abandoned us, so a plea for him not to do so “again” would have been practically heretical in Bach’s day!
No, the real problem here is the word “nevermore”, probably because it isn’t really in current usage. If you look it up in a dictionary, you will find this:
never again; never thereafter:
…which doesn’t seem to help much… except that the second meaning listed (‘never thereafter’) could usefully imply the meaning ‘not from this point on’, without that implying a previous occurrence. In fact, if you were to drop a ‘t’ and say the relevant meaning was never hereafter, that would tick all the right boxes: ‘don’t forsake your people at any time in the future’ would not imply that he’d ever previously abandoned them. It’s just a prayer for an insurance policy for the future, which is much less theologically problematic.
Hence my own translation of these lines is:
God, oh God, never hereafter
abandon your own people!
I’m not saying it’s perfect, but I do say that my translation manages to avoid a nasty implication (of previous abandonment) that two of the earlier translations explicitly do not (and the third one… well, it’s using the word ‘nevermore’ which no-one is likely to be familiar with, so who knows whether it’s implying previous abandonment or not!).
Sie ist den Sodomsäpfeln gleich,
Und die sich mit derselben gatten,
Again, three different translations of these two lines as found in various sources:
It is like Sodom’s apples,
And those who consort with them (Dürr)
It is like the apples of Sodom
and those who join with it (Browne)
It is like the apples of Sodom,
and those who engage themselves with it (Emmanuel)
So all versions mention apples, Sodom and some sort of conjoining, engaging with or consorting. It’s in the context of a description of ‘the nature of wicked sin’… it’s like, er, something to do with Sodom’s apples.
The literal translation of the original German as provided by a dumb machine translation tool such as Google Translate is:
It’s like Sodom apples,
And who are married to the same
If you take that second line word-by-word, you get something more like “And they themselves with the same (type of) spouse” …and given the context is ‘the nature of the wicked sin of Sodom’, and knowing that the English word ‘sodomy’ is pretty explicit about what that ‘wicked sin’ was, I think we can get pretty close to the intended meaning with a translation such as:
It’s like the fruit of Sodom,
where people engaged in homosexual relations
But this is swerving a little bit to the ‘too literal’ school of translation, I feel! I think mention of ‘the fruits of Sodom’ is all one really needs to say (and all Bach’s librettist needed to say) to get the message across: everyone in early 18th Century Europe would know what sort of sexual horrors were being recalled by that mention of the infamous City of the Plain. We are then left with a fruit-of-the-tree metaphor to deal with -and it seemed to me highly appropriate to consider what one does with apples: eat them! Thus, we arrive at my actual translation of this passage:
It is like the apples of Sodom
and those who consume them:
This avoids the grammatical howler present in both the Emmanuel and Browne translations: ‘apples of Sodom… and those who join with it’ or ‘…those who engage themselves with it’, which is a mix-up of pronouns as far as I can tell: surely, if you’re going to ‘join with’ or ‘engage with’ apples in the plural, you should then join or engage with them, not it? It also avoids the faintly ludicrous imagery associated with “consorting with apples” as found in Dürr. My version does dodge an explicit mention of what was going on in Sodom, true enough; but it does pretty clearly imply people ‘consuming the fruit of Sodom’, which is (I think) fairly unambiguous, without getting too medical about it, and keeping it suitably poetical and high-flown.
I should also mention in passing at this point that ‘Sodom’s Apples’ is actually a real plant: it bears fruits which look like the real thing, but are essentially a puff-ball that disappears into a pile of ashes if you touch them. So one could read the German in a literal sense, where Sodomsäpfeln is a proper noun. I just don’t think it makes much sense to do so here, though, as ‘consorting with’ or ‘joining with’ or ‘engaging with’ a piece of real fruit is a piece of imagery I just can’t get my head around! Compare it to BWV 95, however, where the sense really is of a literal fruit being harvested. Harvesting a fruit I can cope with; ‘consorting with fruit’ is more tricky! So in BWV 54, I do believe a metaphorical interpretation of ‘Sodom’s Apples’ is required, rather than a literal one.
8.0 Source of German
It would be a lie if I said I carefully typed all my original German texts in from Bach’s manuscripts! A big one, too!!
I have obtained my source German from https://webdocs.cs.ualberta.ca/~wfb/cantatas/bwv. I have then compared it to the German in Dürr by hand. If there are differences of spelling or punctuation, Dürr wins.
The process is semi-automated by initially running the following bash script to obtain a ‘fair copy’ of the German for any given cantata:
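#!/bin/bash
# Bash script to obtain original German source text for each of
# Bach's cantatas in turn. Requires a BWV number at the command
# line when run. So, for example:
#
#   ./sourcebach.sh 21
#
# ...will obtain and process the text for BWV 21. The output file
# will be stored in your current working directory (the one you
# were in when you invoked the script).

source_url="https://webdocs.cs.ualberta.ca/~wfb/cantatas/"
doc_url="$1.html"

# Download the HTML page (source_url already ends in a slash)
wget "${source_url}${doc_url}"

# Strip HTML tags, then the &nbsp; codes used for forced indentation
sed -i 's/<[^>]*>//g' "$doc_url"
sed -i 's/&nbsp;//g' "$doc_url"

# Collapse runs of blank lines into a single blank line
sed -i 'N;/^\n$/D;P;D;' "$doc_url"

# Turn HTML character entities into the actual characters
sed -i 's/&Auml;/Ä/g' "$doc_url"
sed -i 's/&auml;/ä/g' "$doc_url"
sed -i 's/&Eacute;/É/g' "$doc_url"
sed -i 's/&eacute;/é/g' "$doc_url"
sed -i 's/&Ouml;/Ö/g' "$doc_url"
sed -i 's/&ouml;/ö/g' "$doc_url"
sed -i 's/&Uuml;/Ü/g' "$doc_url"
sed -i 's/&uuml;/ü/g' "$doc_url"
sed -i 's/&szlig;/ß/g' "$doc_url"

# Delete the last 18 lines (the web page footer). The backslash stops
# the shell expanding $d, so that sed receives an "N,$d" command.
sed -i "$(($(wc -l < "$doc_url")-17)),\$d" "$doc_url"

# Rename the result from .html to .txt
mv "$doc_url" "$(basename "$doc_url" .html).txt"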
It looks complicated, but you basically invoke the script with the BWV number whose German text you want to obtain, and everything happens automatically from that point on. For example:
./sourcebach.sh 101
…will get you a fairly clean copy of the German for BWV 101.
The first few lines simply invoke wget to download the source text in HTML format. The sed lines then strip the downloaded file of its various HTML characteristics, to turn it into a plain text file. The first sed line strips out all “< … >” HTML tags, including anything between the < and > symbols in each case. The second sed line strips out forced indentations in the original caused by inserted “&nbsp;” codes. The third sed line replaces multiple contiguous blank lines with single blank lines. The fourth sed line onwards turns HTML codes for foreign characters into the actual foreign characters (so “&szlig;” becomes “ß”, for example).
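To see the tag-stripping and character substitutions at work without downloading anything, here is a minimal sketch that applies the same sort of sed expressions to a single line of text. The function name html_to_text and the sample HTML line are made up for this demonstration and are not part of the script itself. The real script edits the downloaded file in place, whereas this version reads from standard input.

```bash
#!/bin/bash
# Illustrative helper (hypothetical name, not part of sourcebach.sh):
# reads HTML on stdin and writes plain text on stdout.
html_to_text() {
  sed -e 's/<[^>]*>//g' \
      -e 's/&nbsp;//g' \
      -e 's/&Auml;/Ä/g' -e 's/&auml;/ä/g' \
      -e 's/&Ouml;/Ö/g' -e 's/&ouml;/ö/g' \
      -e 's/&Uuml;/Ü/g' -e 's/&uuml;/ü/g' \
      -e 's/&szlig;/ß/g'
}

# A made-up sample line in the general style of an HTML page:
printf '%s\n' '<b>&nbsp;&nbsp;La&szlig; es dir nicht sein entgegen,</b><br>' | html_to_text
```

The tags and the two “&nbsp;” codes disappear and “&szlig;” turns into “ß”, leaving just the line of German. Note that the “&” only has a special meaning on the replacement side of a sed substitution, so it can safely appear unescaped in the search pattern.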
The last sed command removes the last 18 lines from the file (these always contain the ‘footer’ from the original web page and are of no further interest as far as translation work is concerned). The command subtracts 17 rather than 18 because the range it deletes is inclusive at both ends: it runs from the line numbered ‘total minus 17’ up to and including the last line, which makes 18 lines in all.
The last line (“mv…”) renames the downloaded file so that it is now a .txt file, rather than a .html one.
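The line arithmetic is easy to check on a throwaway file. The sketch below uses a made-up helper called strip_footer (a hypothetical name, not something in the script) which prints a file minus its last 18 lines instead of editing it in place, and tries it on a temporary 30-line file of numbers.

```bash
#!/bin/bash
# Illustrative helper (hypothetical name): print a file without its
# last 18 lines, using the same line arithmetic as the script.
strip_footer() {
  local file="$1"
  local total
  total=$(wc -l < "$file")
  # Lines (total - 17) through the last line ($) are deleted,
  # which is 18 lines in all because both ends are included.
  sed "$((total - 17)),\$d" "$file"
}

# Demo on a temporary 30-line file: lines 13 to 30 are dropped,
# so lines 1 to 12 remain.
demo=$(mktemp)
seq 1 30 > "$demo"
strip_footer "$demo" | wc -l
rm -f "$demo"
```

One caveat with this approach: if a file has fewer than 18 lines, the computed start line is zero or negative and sed will complain, so the technique assumes the footer really is there.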