
The doge meme teaches us so much about language learning and how challenging it can be to accurately combine words and patterns when using another language. The FLAX language system teaches us so much about how we can avoid using dodgy language by employing powerful open-source language analysis tools and authentic language resources.

The FLAX (Flexible Language Acquisition) project has won the LinkedUp Vici Competition for tools and demos that use open or linked data for educational purposes. This post is the one I wrote to accompany our project submission to the LinkedUp challenge.

FLAX is an open-source software system designed to automate the production and delivery of interactive digital language collections. Exercise material comes from digital libraries (language corpora, web data, open access publications, open educational resources) for a virtually endless supply of authentic language learning in context. With simple interface designs, FLAX has been designed so that non-expert users — language teachers, language learners, subject specialists, instructional design and e-learning support teams — can build their own language collections.

The FLAX software can be freely downloaded to build language collections with any text-based content and supporting audio-visual material, for both online and classroom use. FLAX uses the Greenstone suite of open-source multilingual software for building and distributing digital library collections, which can be published on the Internet or on CD-ROM. Issued under the terms of the GNU General Public License, Greenstone is produced by the New Zealand Digital Library Project at the University of Waikato, and developed and distributed in cooperation with UNESCO and the Human Info NGO.

REMIX WITH FLAX

At FLAX we understand that content and data vary in terms of licensing restrictions, depending on the publishing strategies adopted by institutions for the usage of their content and data. FLAX has, therefore, been designed to offer a flexible open-source suite of linguistic support options for enhancing such content and data across both open and closed platforms.

Featuring the Latest in Artificial Intelligence & Natural Language Processing Software Designs

Within the FLAX bag of tricks, we have the open-source Wikipedia Miner Toolkit, which links in related words, topics and definitions from Wikipedia and Wiktionary, as can be seen below in the Learning Collocations collection.

Wikipedia Mining Tool in FLAX Learning Collocations Collection

Featuring Open Data

The FLAX website hosts completed collections as well as collections still being developed with registered users. Current research and development with the FLAX Law Collections is based entirely on open resources selected by language teachers and legal English researchers, as shown in the table below. These collections demonstrate how users can build collections in FLAX according to their interests and needs.

Law Collections in FLAX


| Type of Resource | Number and Source of Collection Resources |
| --- | --- |
| Open Access Law research articles | 40 Articles (DOAJ – Directory of Open Access Journals, with Creative Commons licenses for the development of derivatives) |
| MOOC lecture transcripts and videos (streamed via YouTube and Vimeo) | 4 MOOC Collections: English Common Law (University of London with Coursera), Age of Globalization (Texas at Austin with edX), Copyright Law (Harvard with edX), Environmental Politics and Law (OpenYale) |
| Podcast audio files and transcripts (OpenSpires) | 15 Lectures (Oxford Law Faculty, Centre for Socio-Legal Studies and Department of Continuing Education) |
| PhD Law thesis writing | 50-70 EThOS Theses (sections: abstracts, introductions, conclusions) at the British Library (Open Access but not licensed as Creative Commons – permission for reuse granted by participating Higher Education Institutions) |
| British Law Reports Corpus (BLaRC) | 8.8 million-word corpus derived from free legal sources at the British and Irish Legal Information Institute (BAILII) aggregation website |
| FLAX Wikipedia English | Linking in a reformatted version of Wikipedia (English version), providing key terms and concepts as a powerful gloss resource for the Law Collections. |
| FLAX Learning Collocations | Linking in lexico-grammatical phrases from the British National Corpus (BNC) of 100 million words, the British Academic Written English corpus (BAWE) of 2500 pieces of assessed university student writing from across the disciplines, and the reformatted Wikipedia corpus in English. |
| FLAX Web Phrases | Linking in a reformatted Google n-gram corpus (English version) containing 380 million five-word sequences drawn from a vocabulary of 145,000 words. |

FLAX Training Videos

Featuring Game-based Activities

A range of game-based activities can be applied to language collections in FLAX.

FLAX Apps for Android

We also have a suite of free game-based FLAX apps for Android devices. Now you can interact with the types of activities listed above while you’re learning on the move. Click on the FLAX app icon to the right to access and download the apps and enjoy!


FLAX Research & Development

To date, we have distributed the English Common Law and the Age of Globalization MOOC collections in FLAX to thousands of registered learners in over 100 countries – wow!

A collaborative investigation is underway with FLAX and the Open Educational Resources Research Hub (OERRH), whereby a cluster of revised OER research hypotheses is currently being employed to evaluate the impact of developing and using open language collections in FLAX with informal MOOC learners as well as formal English language and translation students.

Princess Mary, Girl Guides, 1922 via Wikimedia Commons

Hey, I’m not even British but as part of Open Education Week – March 5-11 – I’ve just signed a pledge with the new UK-based Open Education SIG, an international special interest group with a UK flavour (not flavor:).

I attended a meeting held at the Open University in the UK at the end of February to discuss the future of open education in the UK. I am a teaching fellow with the Support Centre for Open Resources in Education (SCORE), one of about 400 people working in UK higher education who have been involved in government-funded open educational resources (OER) projects over the last three years. When we all made our applications for funding to the Joint Information Systems Committee (JISC) and the Higher Education Academy (HEA) in the UK, we also made the usual commitment in our proposals to sustaining our OER projects after their funded lifetimes. So, what better way to reinforce this commitment than by signing a renewed pledge to Open Education? While the Cape Town Open Education Declaration has been picked up by many organisations around the world, we thought it would be a good idea to re-mix this declaration to make it more personalised for the educational practitioner.

What does this all mean for English language teaching practitioners?

Frontrunners for technology-enhanced ELT, Russell Stannard and David Deubelbeiss, have also been pushing for more open educational resources and practices within ELT.

Recently, I posted a comment on Scott Thornbury’s A-Z of ELT blog regarding the issues of attribution, re-use and the making of derivative resources for teaching English based on original resources created by another author:

One of the things that interests me most about this post and the comments related to it is the issue of attribution to the original work on automaticity by Gatbonton and Segalowitz. Attribution is essential whether you’re sharing resources in closed teaching and learning environments (e.g. classrooms, password-protected virtual learning environments, workshop and continuing professional development spaces) or through publishing channels using copyright or copyleft licences (e.g. books, research articles, blogs, online forum discussions). There is obviously a great amount of sharing and attribution going on in this discussion and the blogging platform is an enabler for this activity.

What also interests me is the behaviour around resource enhancement. As Scott outlines in the example here, an original resource from a research article by Gatbonton and Segalowitz was re-formatted into a workshop by Stephen Gaies (presumably with attribution to Gatbonton and Segalowitz). This in turn inspired Scott to engage in further resource gathering to inform his teaching practice while applying the five criteria for automaticity, and this further informed the section on fluency in his book, How to Teach Grammar (presumably with attribution to Gaies, but now he realises he should’ve included attribution to Gatbonton and Segalowitz). In its latest iteration we find the same criteria for automaticity here in his blog post, containing more ideas on how to apply this approach in language learning and teaching from both Scott and his blogpost readers. This is a great example of resource enhancement via re-use and re-mixing, something which the Creative Commons suite of licences (http://creativecommons.org/) allows materials developers and users to do while maintaining full legal attribution rights for the original developer as well as extended rights to the re-mixer of that resource to create new derivative resources.

Legally enabling others to openly re-mix your resources and publish new ones based on them was not possible back in 1988. Arguably, Gatbonton and Segalowitz’s paper with the original criteria on automaticity has stood the test of time because of its enhancement through sharing by Gaies and by the same criteria having been embedded in a further published iteration by Scott in How to Teach Grammar. Times have changed and there is a lot we can now do with digital capabilities for best practice in the use and re-use of resources with attribution still being at the core of the exchange between resource creation and consumption. Except that now with self-publishing and resource sharing platforms, including blogs, it’s a lot easier for all of us to be involved in the resource creation process and to receive attribution for our work in sharing. This coming week, March 5-10, is Open Education Week http://www.openeducationweek.org/ with many great resources on how to openly share your teaching and learning resources along with how to locate, re-use, re-mix and re-distribute with attribution those open educational resources created by others. Why not check it out and see how this activity can apply to ELT?

If you’re new to all of this and have any pesky questions about the business models behind open education, please check out Paul Stacey’s blog, Musings on the Edtech Frontier, with his most recent post on the Economics of Open. Information on what the different Creative Commons and Public Domain licences mean can be found at CreativeCommons.org.

Public Domain licence via Flickr
Creative Commons licence via Flickr
Attribution Creative Commons licence via Flickr
Non Commercial Creative Commons licence via Flickr
Share Alike Creative Commons licence via Flickr
No Derivatives Creative Commons licence via Flickr

So, why the interest in British resources for open English?

I’ve been coming in and out of the UK for the past 10 years with my work related to technology-enhanced ELT and EAP. Resources include not only those artifacts that we teach and learn with but also the vibrant communities that come together to share their understandings with peers through open channels of practice. BALEAP, formerly a British organisation (the British Association for Lecturers in English for Academic Purposes) but now with an outreach mandate to become the global forum for EAP practitioners, is such an informal community of practice. Members within BALEAP are actively making up for a deficit in formal EAP training by providing useful resources to both EAP teachers and learners via their website and through lively discussions relevant to current issues in EAP via their mailing list.

Because of my interest in corpus linguistics and data-driven language learning, I’ve also been working with exciting practitioners from the world of computer science, namely those working at the open source digital library software lab, Greenstone, at the University of Waikato in New Zealand, to help with the testing and promotion of their open English language project, FLAX (the Flexible Language Acquisition project). The FLAX team are building open corpora and open tools for text analysis using a combination of both open and proprietary content. A copyrighted reference corpus such as the British National Corpus (BNC) is enhanced within the FLAX project by being linked to different open reference corpora, such as a Wikipedia corpus and a Web-derived corpus (released by Google), as well as specialist corpora, including the copyrighted British Academic Written English (BAWE) corpus, developed by Nesi, Gardner, Thompson and Wickens between 2004 and 2007 and housed within the Oxford Text Archive (OTA).

Oxford University Computing Services (OUCS) manage the OTA along with jointly managing the BNC which is physically housed at the British Library. The OpenSpires project is also based at the OUCS and this is where Oxford podcasts have been made openly available through creative commons licences for use and re-use in learning and teaching beyond the brick-n-mortar that is Oxford’s UK campus. Try out the Credit Crunch and Global Recession OER that are based on an Oxford seminar series and have been enhanced with corpus-based text analysis resources. Or, make your own resources based on these same seminars to share with your own learning and teaching communities. In addition to being housed on the OUCS website these resources, along with many other creative commons-licensed resources from educational institutions around the world, can also be found on the Apple channel, iTunesU.

So, it seems there’s quite a bit going on with open English in the UK that’s worth engaging with, and maybe even making a commitment to sharing with open educational resources and practices.

A final take-away

Check out FLAX’s new Learning Collocations collection where you can compare collocations for keyword searches and harvest useful phrases to embed into your writing, using the BAWE and the BNC along with corpora derived from Wikipedia and the Web. There are three training videos on how to use the Learning Collocations collection in FLAX available in the Training Videos section of this blog.
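
For readers curious about what comparing collocations for a keyword involves, here is a minimal sketch in Python. It is not how FLAX works internally: the function name `collocates`, the window size and the sample sentences are all made up for illustration. It simply counts the words that turn up within a couple of tokens of a keyword.

```python
import re
from collections import Counter


def collocates(text, keyword, window=2):
    """Count words that occur within `window` tokens of `keyword`.

    Hypothetical helper for illustration only; a real corpus tool would
    also use part-of-speech tags and association statistics.
    """
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != keyword:
            continue
        # Look at the neighbours on both sides of this keyword occurrence.
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


# Invented sample text, standing in for a real corpus such as the BNC or BAWE.
sample = (
    "The court made a decision. The judge made a ruling. "
    "She made a decision quickly and made progress."
)

for word, freq in collocates(sample, "made").most_common(5):
    print(word, freq)
```

Running the same function over two different texts and comparing the counts gives a rough feel for how a keyword behaves in each. Raw counts like these are dominated by very common words such as "a" and "the", which is one reason dedicated tools filter and rank collocations rather than relying on frequency alone.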