Induction to the max. Michael Cysouw Philipps-University Marburg

Introducing the Parallel Text Corpus Michael Cysouw Philipps-University Marburg

Basic Problem of Language Comparison
How to compare like with like: Domain, Tertium Comparationis, Comparative Concept, Function, etc.
Even better: Contextually Situated Exemplars (stimuli-based elicitation, translational equivalence)

Parallel Bible Corpus
1169 translations
906 different ISO 639-3 codes
In total more than 350 million wordforms
More than 17 million different wordforms
http://paralleltext.info/data
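The difference between "wordforms in total" (tokens) and "different wordforms" (types) can be illustrated with a minimal Python sketch. The file layout used here, one verse per line as a numeric verse ID, a tab, and the verse text, with `#` lines as headers, is an assumption for illustration, as are the function name `count_wordforms`, the whitespace tokenization, and the second verse ID in the sample.

```python
from collections import Counter


def count_wordforms(lines):
    """Return (tokens, types) for lines shaped like 'verse_id<TAB>text'."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Assumed convention: blank lines and '#' lines carry no verse text.
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t", 1)
        if len(parts) < 2:
            continue
        for token in parts[1].lower().split():
            # Skip pure punctuation so that only wordforms are counted.
            if any(ch.isalnum() for ch in token):
                counts[token] += 1
    return sum(counts.values()), len(counts)


sample = [
    "# hypothetical header line",
    "43019036\tNot one of his bones will be broken",
    "40002003\tBut Herodes the king heard , and was troubled , and all Urishlem with him",
]
tokens, types = count_wordforms(sample)
print(tokens, types)
```

In the sample only "and" occurs twice, so the type count is one lower than the token count.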

Demo

Software
Contact me personally for access to the data
R package qlcmatrix: http://cran.r-project.org/web/packages/qlcmatrix/index.html and https://github.com/cysouw/qlcmatrix
Python library: https://github.com/tmayer/paralleltextprocessing

Multiple Alignment
Based on sentence-by-sentence alignment, induce word-by-word alignment.
Translations can be (and often are!) quite different.
Bi-text alignment is a widely researched problem.
Multi-text alignment not so much (but multi-string alignment in bioinformatics is!).
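As a rough illustration of how word-level correspondences can be induced from verse-level alignment alone, the sketch below scores word pairs by how often they occur in the same verses, using the Dice coefficient. This is a deliberately simple stand-in and not the method used in the talk. The function names and the toy verses are invented for illustration.

```python
from collections import Counter
from itertools import product


def dice_scores(verses_a, verses_b):
    """Dice association between wordforms of two verse-aligned texts.

    verses_a and verses_b map a verse ID to a list of tokens.
    """
    freq_a, freq_b, joint = Counter(), Counter(), Counter()
    for verse_id in verses_a.keys() & verses_b.keys():
        set_a, set_b = set(verses_a[verse_id]), set(verses_b[verse_id])
        freq_a.update(set_a)                     # verses containing each word of A
        freq_b.update(set_b)                     # verses containing each word of B
        joint.update(product(set_a, set_b))      # verses containing both words
    return {
        (a, b): 2 * n / (freq_a[a] + freq_b[b])
        for (a, b), n in joint.items()
    }


def best_match(scores, word):
    """Return the wordform in text B with the highest Dice score for `word`."""
    candidates = {b: s for (a, b), s in scores.items() if a == word}
    return max(candidates, key=candidates.get) if candidates else None


# Toy verse-aligned data (invented, not taken from the corpus).
eng = {1: ["the", "tree", "fell"], 2: ["a", "tree", "grew"],
       3: ["the", "wood", "burned"], 4: ["a", "wood", "pile"]}
deu = {1: ["der", "baum", "fiel"], 2: ["ein", "baum", "wuchs"],
       3: ["das", "holz", "brannte"], 4: ["ein", "holz", "stapel"]}

scores = dice_scores(eng, deu)
print(best_match(scores, "tree"), best_match(scores, "wood"))
```

With real translations, free renderings make such co-occurrence statistics much noisier, which is one reason dedicated alignment tools are used instead.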

Kong Herodes blev skrækslagen, og Jerusalem begyndte at summe af rygter.
Als dies dem König Herodes zu Ohren kam, erschrak er, und mit ihm entsetzte sich auch ganz Jerusalem.
But Herodes the king heard, and was troubled, and all Urishlem with him.
Þegar Heródes heyrði þetta, varð hann skelkaður og öll Jerúsalem með honum.
Als des dr Kenig Herodes ghärt het, isch scha ( er ) arg vuschrocke un mit nem ganz Jerusalem,
Kort voor lank het Herodes ook van die geleerdes uit die ooste se storie te hore gekom. Hy was baie omgekrap oor wat hulle oor die nuwe Joodse koning gesê het. So ook die res van Jerusalem.
Der König Herodes war total aufgebracht, als er das hörte, und nicht nur er, alle in Jerusalem waren das.

Multiple Alignment
Small-scale experiment: use fast_align for bitext alignment on all pairs, then build the multi-text alignment from there.
Only for 77 Germanic translations of the New Testament.
This produced almost 100,000 Germanic alignments, i.e. sets of directly comparable words.
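The slide does not say how the pairwise alignments are combined. One possible way, shown here purely as an assumption, is to treat every pairwise link as an edge and take connected components, so that words linked directly or indirectly end up in one multi-text alignment set. The names `UnionFind` and `merge_pairwise` and the example links are hypothetical.

```python
from collections import defaultdict


class UnionFind:
    """Minimal disjoint-set structure over hashable items."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_a] = root_b


def merge_pairwise(links):
    """Merge pairwise links into sorted multi-text alignment sets."""
    uf = UnionFind()
    for a, b in links:
        uf.union(a, b)
    groups = defaultdict(set)
    for node in list(uf.parent):
        groups[uf.find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


# Each node is (translation, wordform); each link comes from one bitext pair.
links = [
    (("dan", "Herodes"), ("deu", "Herodes")),
    (("deu", "Herodes"), ("eng", "Herodes")),
    (("dan", "Jerusalem"), ("eng", "Urishlem")),
    (("deu", "Jerusalem"), ("eng", "Urishlem")),
]
for group in merge_pairwise(links):
    print(group)
```

A known weakness of this design is that a single wrong pairwise link merges two otherwise separate sets, so a real pipeline would likely need some filtering of weak links first.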

trees and wood

Louis Hjelmslev, Prolegomena to a Theory of Language (1963)
afr-x-bible-1953.txt: boom bome hout kruishout vyeboom
deu-x-bible-erben.txt: baum holz bäume feigenbaum
deu-x-bible-freebible.txt: baum bäume baume holz holze feigenbaum
nob-x-bible-2007.txt: tre treet trærne trær fikentreet

where

deu-x-bible-pattloch.txt: wo woher wohin dort da
eng-x-bible-kingjames.txt: where whence whither there from when
eng-x-bible-darby.txt: where whence there whither
eng-x-bible-treeoflife.txt: where there place wherever
swe-x-bible-folk1998.txt: där var varifrån vart dit plats

Indefinite person (someone, anyone)

nld-x-bible-1951.txt: iemand niemand wie een ieder
dan-x-bible-1931.txt: nogen den en ingen som hver ikke
eng-x-bible-darby.txt: any one no whosoever that he some man every

Lexical comparison

Verse 43019036 (John 19:36): These things happened so that the scripture would be fulfilled: "Not one of his bones will be broken,"
been, ben, been, bein, bein, bein, knochen, bein, knochen, bein, bein, bein, knochen, knochen, knochen, bein, knochen, bein, bein, knochen, bones, bones, bones, bones, bone, bones, bones, bones, bones, bone, bone, bones, bones, bones, bones, bones, bones, bone, bones, bones, been, beenderen, botten, been, bein, ben, ben, ben, bein, ben, bein, ben, ben, ben

Different contexts give different cognates!
Verses: Matthew 1:16 | Matthew 1:19 | Matthew 19:10 | Mark 10:12
afr-x-bible-boodskap.txt: man | man | mens | man
afr-x-bible-1953.txt: man | man | man | man
deu-x-bible-volxbibel.txt: mann | mann | ehemann (only three forms given)
eng-x-bible-common.txt: husband | husband | man | man
eng-x-bible-literal.txt: husband | husband | husband | husband
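The point that the aligned form depends on the context can be checked mechanically once the forms are stored per translation and verse. The following minimal sketch uses the four complete rows from the slide; the row for deu-x-bible-volxbibel.txt is left out because it has only three forms. The function name `context_variation` is invented for illustration.

```python
verses = ["Matthew 1:16", "Matthew 1:19", "Matthew 19:10", "Mark 10:12"]

# Aligned wordform per translation, in the order of `verses`.
forms = {
    "afr-x-bible-boodskap.txt": ["man", "man", "mens", "man"],
    "afr-x-bible-1953.txt": ["man", "man", "man", "man"],
    "eng-x-bible-common.txt": ["husband", "husband", "man", "man"],
    "eng-x-bible-literal.txt": ["husband", "husband", "husband", "husband"],
}


def context_variation(forms):
    """Return the translations that use more than one form across the verses."""
    return {
        name: sorted(set(row))
        for name, row in forms.items()
        if len(set(row)) > 1
    }


varying = context_variation(forms)
for name, variants in varying.items():
    print(name, variants)
```

Translations that vary here would yield different cognate sets depending on which verse is used as the comparison context.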

Conclusion
Massively parallel texts are a goldmine for language comparison.
Much experimentation is needed to find suitable methods to enrich the data.
Collaboration welcome (using a git-based approach).