Radio Ga Ga by Queen via YouTube

This is the first satellite post from the mothership post, Radio Ga Ga: corpus-based resources, you’ve yet to have your finest hour. I have also made the complete hyperlinked post (in five sections) available as a .pdf on Slideshare.

Radio 1

Original, in-house and live, this station brings us what’s new in the world of OER for corpus-based language resources.

Flipped conferencing

Kicking things off in late March, Clare Carr from Durham and I co-presented an OER for EAP corpus-based teacher and learner training cascade project at the EuroCALL CMC & Teacher Education Annual Workshop in Bologna, Italy. This was very much a flipped conference: draft presentation papers were sent to participants to read in advance, and the physical event focused on discussion rather than presentation. Russell Stannard of Teacher Training Videos (TTV) was the keynote speaker at this conference, and I have been developing some training resources for the FLAX open-source corpus collections which will be ready to go live on TTV soon. New collections in FLAX have opened up the BAWE corpus and have linked it to the BNC, a Google-derived n-gram corpus and Wikimedia resources, namely Wikipedia and Wiktionary. These collections in FLAX show what’s cutting edge in the developer world of open corpus-based resources for language learning and teaching.

Focusing on linked resources: which academic vocabulary list?

In a later post, I will be looking at Mark Davies’ new work with Academic Vocabulary Lists based on a 110 million-word academic subcorpus in the Corpus of Contemporary American English (COCA) – moving away from the Academic Word List (AWL) by Coxhead (2000) based on a 3.5 million-word corpus – and his innovative web tools and collections based on the COCA. Once again, Davies’ Word and Phrase project website at Brigham Young University contains a bundle of powerfully linked resources, including a collocational thesaurus which links to other leading research resources such as WordNet, the ongoing lexical database project at Princeton.

The open approach to developing non-commercial learning and teaching corpus-based resources in FLAX also shows the commitment to OER at OUCS (including the Oxford Text Archive), where the BAWE and the BNC research corpora are both managed. Click on the image below to visit the BAWE collections in FLAX.

BAWE case study text from the Life Sciences collection in FLAX with Wikipedia resources

Open eBooks for language learning and teaching

Learning Through Sharing: Open Resources, Open Practices, Open Communication was the theme of the EuroCALL conference and, to follow things up, the organisers have released a call for OER in languages for the creation of an open eBook on the same theme. The book will be “a collection of case studies providing practical suggestions for the incorporation of Open Educational Resources (OER) and Practices (OEP), and Open Communication principles to the language classroom and to the initial and continuing development of language teachers.” This open-access eBook, aimed at practitioners in secondary and tertiary education, will be freely available for download. If you’re interested in contributing to this electronic volume, please send in a case study proposal (maximum 500 words) by 15 October 2012 to the co-editors of the publication, Ana Beaven (University of Bologna, Italy), Anna Comas-Quinn (Open University, UK) and Barbara Sawhill (Oberlin College, USA).

MOOC on Open Translation tools and practices

Another learning event which I’ve just picked up from EuroCALL is a pilot Massive Open Online Course (MOOC) in open translation practices being run by the Open University in the UK from 15 October to 7 December 2012 (8 weeks), with the accompanying course website opening on 10 October 2012. Visit the “Get involved” tab on the following site: “Open translation practices rely on crowd sourcing, and are used for translating open resources such as TED talks and Wikipedia articles, and also in global blogging and citizen media projects such as Global Voices. There are many tools to support Open Translation practices, from Google translation tools to online dictionaries like Wordreference, or translation workflow tools like Transifex.” Some of these tools and practices will be explored in the OT12 MOOC.

Bringing open corpus-based projects to the Open Education community

On the back of the Cambridge 2012 conference, Innovation and Impact – Openly Collaborating to Enhance Education, held in April, I’ve been working on another eBook chapter on open corpus-based resources which will be launched very soon at the Open Education conference in Vancouver. The Cambridge 2012 event was jointly hosted in Cambridge, England by the OpenCourseWare Consortium (OCWC) and SCORE. Terri Edwards from Durham and I presented on EAP student and teacher perceptions of training with open corpus-based resources from three projects: FLAX, the Lextutor and AntConc. These three projects vary in terms of openness and the type of resources they are offering. In future posts I will be looking at their work and the communities that form around their resources in more depth. The following video from the conference captures our presentation and the ensuing discussion with a non-specialist audience curious to know how open corpus-based resources can help with the open education vision. Embedding these tools and resources into online and distance education to support the growing number of learners worldwide who wish to access higher education, where the OER and most published research are in English, opens a whole new world of possibilities for open corpus-based resources and EAP practitioners working in this area.

A further video from a panel discussion which I contributed to – an OER kaleidoscope for languages – looks at three further open language resources projects that are currently underway and building momentum here in the UK: OpenLives, LORO and the CommunityCafe. Reference is also made in this talk to other established OER projects for languages and the humanities, including LanguageBox and the HumBox.

A world declaration for OER

The World OER Congress in June at the UNESCO headquarters in Paris marked ten years since the coining of the term OER in 2002 along with the formal adoption of an OER declaration (click on the image to see the declaration). I’ve included the following quotation from the OER declaration to provide a backdrop to this growing open education movement as it applies to language teaching and learning, highlighting that attribution for original work is commonplace with Creative Commons licensing.

Emphasizing that the term Open Educational Resources (OER) was coined at UNESCO’s 2002 Forum on OpenCourseWare and designates “teaching, learning and research materials in any medium, digital or otherwise, that reside in the public domain or have been released under an open license that permits no-cost access, use, adaptation and redistribution by others with no or limited restrictions. Open licensing is built within the existing framework of intellectual property rights as defined by relevant international conventions and respects the authorship of the work”.

Wikimedia – why not?

Wikimedia Foundation

Earlier in September, I volunteered to present at the EduWiki conference in Leicester, which was hosted by the Wikimedia UK chapter. Most people are familiar with Wikipedia, which is the sixth most visited website in the world. It is, however, but one of many sister projects managed by the Wikimedia Foundation, along with others such as Wikiversity and Wiktionary.

I will also be blogging soon about widely held misconceptions for uses of Wikipedia in EAP and EFL / ESL while exploring its potentials in writing instruction with reference to some very exciting education projects using Wikipedia around the world. The types of texts that make up Wikipedia alongside many academics’ realisations that they need to be reaching wider audiences with their work through more accessible modes of writing transmission are all issues I will be commenting on in this blog in the very near future.

The work the FLAX team have done with text mining, incorporating David Milne’s Wikipedia mining tool, makes evident the potential of Wikipedia as an open corpus resource in language learning and teaching. I demonstrated how this Wikipedia corpus has been linked to other research corpora in FLAX, namely the BNC and the BAWE, for the development of corpus-based OER for EFL / ESL and EAP. And let’s not forget that it’s all free!

The open approach to corpus resources development

There is no reason why the open approach taken by FLAX cannot be extended to build open corpus-based collections for learning and teaching other modern languages, linking different language versions of Wikipedia to relevant research corpora and resources in the target language. In particular, functionality in the FLAX collections that enables you to compare how language is used differently across a range of corpora, further supported by additional resources such as Wiktionary and Roget’s Thesaurus, makes for a very powerful language resource. Crowd-sourcing corpus resources through open research and education practices and through the development of open infrastructure for managing and making these resources available is not as far off in the future as we might think. The Common Language Resources and Technology Infrastructure (CLARIN) mission in Europe is a leading success story in the direction currently being taken with corpus-based resources (read more about the recent workshop for CLARIN-D held in Leipzig, Germany).


Coxhead, A. (2000). The Academic Word List.


Radio Ga Ga album cover by Queen via Wikipedia

These past few months I’ve been tuning into a lot of different practitioner events and discussions across a range of educational communities which I feel are of relevance to English language education where uses for corpus-based resources are concerned. There’s something very distinct about the way these different communities are coming together and in the way they are sharing their ideas and outputs. In this post, I will liken their behaviour to different types of radio station broadcast, highlighting differences in communication style and the types of audience (and audience participation) they tend to attract.

I’ve also been re-setting my residential as well as my work stations. No longer at Durham University’s English Language Centre, I’m now London-based and have just set off on a whirlwind adventure for further open educational resources (OER) development and dissemination work with collaborators and stakeholders in a variety of locations around the world. TOETOE is going international and is now being hosted by Oxford University Computing Services (OUCS) in conjunction with the Higher Education Academy (HEA) and the Joint Information Systems Committee (JISC) as part of the UK government-funded OER International programme.

I will also be spreading the word about the newly formed Open Education Special Interest Group (OESIG), the Flexible Language Acquisition (FLAX) open corpus-based language resources project at the University of Waikato, and select research corpora, including the British National Corpus (BNC) and the British Academic Written English (BAWE) corpus, both managed by OUCS, which have been prised open by FLAX and TOETOE for uses in English as a Foreign Language (EFL) – also referred to as English as a Second Language (ESL) in North America – and English for Academic Purposes (EAP). Stay tuned to this blog in the coming months for more insights into open corpus-based English language resources and their uses in different teaching and learning contexts.

This post is what those in the blogging business refer to as a ‘cornerstone’ post as it includes many insights into the past few months of my teaching fellowship in OER with the Support Centre in Open Educational Resources (SCORE) at the Open University in the UK. Many posts within one, as it were. This post also provides a road map for taking my project work forward while identifying shorter blogging themes for posts that will follow this one. This particular post will also act as the mothership TOETOE post from which subsequent satellite posts will be linked. Please use the red menu hyperlinks in the section below to dip in and out of the four main sections of this blog post series. I have chosen this more reflective style of writing through blogging so that my growing understanding in this area is more accessible to unanticipated readers who may stumble upon this blog and hopefully make comments to help me refine my work. Two more formal case studies on my TOETOE project to date will be coming out soon via the HEA and the JISC.

I have also made this hyperlinked post (in five sections) available as a .pdf on Slideshare.

Which station(s) are you listening to?

BBC Radio has been going since 1927. With audiences in the UK, four stations in particular are firm favourites: youth-oriented BBC Radio 1, featuring new and contemporary music; BBC Radio 2, with middle-of-the-road music for the more mature audience; high culture and arts-oriented BBC Radio 3; and news and current affairs-oriented BBC Radio 4. Of course there are many more stations but these four are very typical of those found around the world. What is more, I’ve selected these four very distinct stations as the basis for a metaphor about the way four very distinct educational practitioner communities are intersecting with corpus-based language teaching resources. This metaphor will draw on thought waves from the following:

Radio 1 – what’s new and hip in open corpus-based resources and practices

Radio 2 – the greatest hits in ELT materials development and publishing

Radio 3 – research from teaching and language corpora

Radio 4 – the current talk in EAP: open platforms for defining practice

For every E it follows that there is an I, at least that’s how it would appear with Apple’s latest and much debated ‘free’ app hitting the mainstream and educational markets, iBooks Author. The only instance I can think of in the reverse is I, Claudius, first penned by Robert Graves, which now with an e-reader can be read as an e-I, Claudius.

When considering throwbacks to the analogue age, what exactly is it about e-books that makes us (not forgetting publishers and hardware vendors) feel so at home with this type of packaging for content? Are e-books, and their close cousins the e-coursebooks, the great hand-holders as we make the transition from the semi-digital world of print media production toward a webbed bundle of digital content including dynamic RSS feeds, all of which can in effect be converted and customized into an e-book format? There is a very lively and timely open education seminar and collaborative e-book writing event going on right now within SCoPE (hosted by BCcampus in Canada), discussing the very nature of e-books. Writing an e-book about e-books for fun and no profit: February 1-14, 2012 is definitely worth checking out.

Similarly, within the ELT community, Scott Thornbury’s latest A-Z of ELT entry on e-coursebooks has created a lot of post-blogpost activity about the ‘need’ for coursebooks, digital or otherwise, in language teaching and learning. He offers an alternative 8-point ELT scenario for tapping into and toying with a mash-up of available technologies, both open and proprietary. YouTube is an endless resource provided you don’t live in a country or work in an institution where it is blocked, and this is where Apple’s iTunesU as an educational content channel wins the day again. To provide just one example of this success, The Open University in the UK has experienced 34 million downloads of their educational content on iTunesU since June 2008, much of which is open content released under Creative Commons licences.

I take Scott’s point that Tom Cobb’s Lextutor is an invaluable resource if you know how to use it and are willing to invest the time, as he suggests, to make the most of it in your learning and teaching. However, more in the way of training and the development of pedagogic wrappers for helping teachers and learners exploit corpus tools and resources effectively would not go amiss. I’ll be talking more along these lines in future posts.

Needless to say, this discussion on e-resources in the A-Z of ELT blog along with David Deubelbeiss’ call via EFL 2.0 Teacher Talk to disrupt ELT with more openness is what has inspired me to kick-start this blog – thanks, guys.

Pedagogic wrappers

Chinese spring roll wrappers, Burma Image via flickr creative commons

We have become too dependent on coursebooks and off-the-shelf dedicated resources for ELT. I’ve spent the better part of the last 10 years trying to deprogramme myself away from the ELT textbook consumer culture that I was formally trained into by Cambridge ESOL pre- and in-service teacher training courses. Yes, we could SARS – Select, Adapt, Reject, Supplement – (Graves, 2003) sections of a coursebook, as we were trained to do, but the coursebook still remains the crux of the lesson.

Anna Comas-Quinn of the LORO project (Languages Open Resources Online) talks of typical language teachers as being those who will beg, borrow and steal anything to teach a language point effectively. We’ve always done this to make our classes more interesting – taking a clip from a video here, chopping up a research article there – as we try to engage our students in authentic communication in the target language. So, in many ways we’ve always been at odds with the coursebook. But how often does our pedagogy, embedded in useful resources which we have painstakingly designed, remain locked in the secret garden that is our classroom? Or within the password-protected virtual learning environment of our institutions?

Our language teaching community would benefit greatly from the sharing of these resources and pedagogic wrappers in the form of lesson plans and tips for teaching. But what are the barriers to sharing when we’ve never been trained in intellectual property rights and the use of third party materials? If we had been trained in harvesting and harnessing open technologies and resources, then perhaps we would build resources from a different starting point, making it easier for us to share. We might even end up promoting ourselves and our institutions by releasing our open educational resources (OER) into the wild, a different business model worth exploring.

Image via flickr creative commons

By tapping into informal open education practitioner communities like those who hosted the recent Open Content Licensing for Educators 2012 (OCL4Ed, sponsored by Ako Aotearoa – New Zealand’s National Centre for Tertiary Teaching Excellence), attended online by 1067 people from 87 countries with 15,961 unique visits to the WikiEducator site, we can start to grow our skills and understanding in this important area of materials development and dissemination for free. The OCL4Ed materials were developed openly and collaboratively by dedicated volunteers from the OER Foundation, WikiEducator, the OpenCourseWare Consortium (OCWC) and Creative Commons with funding support from UNESCO. If you’d like to learn more about Creative Commons, check out this video here.

Nobody does it better…?

‘Bond’ image via flickr creative commons

Apple and Amazon are disrupting publishing and their pockets run very deep. Educational resources developers, many of whom are teachers, have always engaged in contracts with traditional publishers to pay for the costs of publishing in one form or another. With the launch of iBooks Author, Apple has its eyes on the K-12 market and this comes with its problems, as is discussed here in the Scholarly Kitchen, a vibrant blog on educational publishing. David Crotty argues against Apple’s rush for rich media, stating that he’s perfectly happy to read an e-book without the bells and whistles of animations and embedded scenes from movies etc. to pump up the text and the e-reader experience, as has been the case with the release of the popular and digitally-enhanced Alice for the iPad e-book published by Atomic Antelope. He may not be so interested in the hype around adding movie clips and animations to text but language teachers are interested in drawing their students’ attention to differences in features of spoken and written discourse, and e-books offer us the potential to combine resources in this way.

Apple has pushed beyond the open ePUB format standards for e-books, which don’t necessarily support such a high level of rich media, and has come up with its own iBooks file format instead. In many ways this push for richer media standards is admirable. But its EULA (End User Licence Agreement) doesn’t leave educational resources developers, many of whom are teachers, much room to move with either open or proprietary resources: it locks us down with a file format for use only on iPads and with iBook sales only through the iBooks Store.

By tuning into the OER community and by playing with and learning about different technologies and licensing standards, we may not always come up with e-resources that are as flashy as the high-end iBooks prepared by animation artists (although there are some animation artists floating about the OER world who would love to help!). We can, however, between us come up with pedagogically relevant e-resources that can be shared and re-used.


‘Pedagogic wrappers’ – term coined by Tom Browne, SCORE fellow with the Open University.

Graves, K. (2003). Coursebooks. In D. Nunan (Ed.), Practical English Language Teaching. New York: McGraw-Hill.