Many thanks to Mura Nava of EFLnotes for conducting and collating responses from these mini interviews in relation to the community corpus-based projects that will be part of the upcoming IATEFL event in Birmingham.

Following on from what could be described as a corpus carnival this year, some of those presenters kindly answered 5 questions. I list them in approximately chronological order:

Teaching the pragmatics of spoken requests in EAP
Christian Jones (University of Liverpool, UK)

Answering language questions from corpora (awaiting)
James Thomas (Masaryk University)

Using English Grammar Profile to improve curriculum design
Geraldine Mark (Gloucestershire College/Cambridge University Press) & Anne O’Keeffe (Mary Immaculate College, Limerick/Cambridge University Press)

Electronic theses online – developing domain-specific corpora from open access
Alannah Fitzgerald (Concordia University) & Chris Mansfield (Queen Mary University of London)

Guiding EAP learners to autonomously use online corpora: lessons learned
Daniel Ruelle (RMIT University Vietnam)

Christian Jones
1. Who are you?
I am a Senior Lecturer in Applied Linguistics and TESOL at the University of Liverpool.
2. Who should come to your talk?
EAP or EFL teachers interested in research into spoken language…

Building an open source business by Libby Levi licensed CC BY-SA


[This post originally appeared on the ELTjam blog.]

I was asked a question by an ELT materials writer at the BALEAP English for Academic Purposes conference earlier this year, along the lines of:

You’ve shown us a lot of openly licensed content that can be developed into English language learning materials, but what am I expected to do when my publisher asks me to write materials and then release some of them for free without pay? Even if I wanted to share and be more open in my practice, how can I afford to do this?

Good question. My answer here in this post is to look at both the ideas and the business models that are working within open education, and to build on discussions with the wider ELT community on ways to bring issues around access, copyright and materials writing/development to light. We are already seeing these issues played out in our informal online communities: the blogosphere, Twitter, Facebook, LinkedIn, and in webinars like the one coming up with the IATEFL Materials Writing SIG on copyright and images on November 7th.

Lofty Ideals and Lowly Deals

…you will find all kinds of ambitious proposals and interesting ideas, embedded in lofty ideals. Some of this is quite sensible; little of it is immediately operational. Then have a look at newspaper articles, watch the media, speak to people on the firing lines. Here you will find stories about all kinds of lowly deals, every one of them fully operational. (Mintzberg, 2015)

Publishing is changing dramatically and this is creating a veritable sea-change in education for social initiatives, such as we’ve seen with Free and Fair ELT. ELT materials writers, like many ELT teachers who develop teaching and learning materials, are enthused about sharing because sharing is at the heart of what we do as educators. And, because of the very global nature of ELT, we interface with the world and understand first-hand the imbalance in access to English language education, where English is the lingua franca in education, research and publishing.

This real-world need for English language education has, however, created a parallel reality with the wide-scale infringement of All Rights Reserved published ELT materials. Materials writers are sharing eye-opening stories of copyright infringement here on ELTjam (see here and here) and elsewhere about the coursebook materials they’ve written, which also live a second life in .pdf format via various piracy pay-for sites. Many would like to see the big ELT publishers take a more responsible role in providing access to digital ELT materials for those informal learners who can afford neither the glossy print versions nor the expensive language classes at the well-resourced language institutes the world over that publishers have pegged as their primary market.

Informal online language learning is only going to continue to increase at a staggering rate as more of the world’s population comes online. However, I don’t believe the responsibility to recognise and engage with this growing informal English language learning community should fall solely on the shoulders of the individual materials writer or the individual language teacher, do you?

There are so many opportunities here for the big ‘charities’ in ELT such as the British Council and the big brand ELT publishers to refocus their social impact, which will, in turn, increase their branding power, through corporate social responsibility. Let’s face it, the British Council couldn’t make the profit it does without English language teachers and examiners (Phillipson, 2012), and ELT publishers are dependent on ELT materials writers in the same way that many publishing houses are dependent on academics. The Open Access movement wouldn’t have been as successful as it is today without a nudge from academics who took this movement into the mainstream with events like the Elsevier Boycott.

Open Business Models

‘There are none so blind’, the biblical saying goes, ‘as those who will not see’ … A mindset which couldn’t conceive of a non-hierarchical way of creating an authoritative reference work couldn’t take Wikipedia seriously. (Naughton, 2011).

John Naughton’s keynote address, The Elusive Technological Future, at the 2011 Association for Learning Technology conference, continues to be highly relevant today. Naughton critiques the recurring and dumbfounded view that we often hear in the media that the free technologies underpinning the likes of Wikipedia, Craigslist, Blackberry Messenger and Napster were all disruptive technologies that came out of nowhere. Naughton instead points to how certain establishments, and the mindsets that inhabit them, were not paying attention to these technologies and their somewhat informal communities (the great unwashed, as it were, to carry forward the biblical theme). They were not seen as a credible threat to established business models through simple lack of attention, and by the time these technologies and communities had become the mainstream, the old establishments had missed out on business opportunities of a lifetime.

Creating operational business models is very much on the agenda at Creative Commons, as evidenced in their recent call and successful crowdsourcing with Kickstarter for co-creating ‘a book that shows the world how sharing can be good for business’.

The open education movement recognises the copyright of creators (teachers, writers, developers) while it leverages innovative technologies and practices with teaching and learning materials so that they can be Redistributed, Reused, Repurposed and Remixed at scale to Redress the imbalance our world faces with access to education. Legally, this movement has become operational with the development of the Creative Commons suite of licences available to creators so that they can share their creations and specify how they want them to be reused.

This approach would appear to be out of balance, though, when we consider the many freelance ELT materials writers who are often caught in the middle and may be required by publishers to give away their copyright and even their work without pay as publishers experiment with new business models, including the freemium model. This is a very different business model, say, from that of the academic employed at a well-funded university where learning resources are created for on-site use, recorded and shared at scale via commercial platforms such as YouTube, iTunesU and with commercial MOOC providers such as Coursera and edX for creating access to learning for the masses, growing the online presence of expert educators, and promoting the brand of institutions.

Social Learning for Social Impact MOOC

I would like to invite anyone interested in this discussion to join the first ever Group-based MOOC, Social Learning for Social Impact with the Faculty of Management at McGill University in Canada and edX, to collaborate and build upon the ethos of sharing ELT resources and raising awareness around copyright. I’m one of the volunteer facilitators on the MOOC. Since I have a background in ELT resources development and open education (with the open-source FLAX language project), I’d like to encourage you to share your experiences in ELT publishing and how this is impacting ELT materials development, and the way resources are being used and misused through copyright infringement.

The bullet points below are the stages of planning for social impact that we would be working through on the course. For example, the Free and Fair ELT initiative is currently growing social impact through social media and is successfully managing to scale this level of outreach. It would be great to discuss ways forward for taking this and similar initiatives in ELT resources outreach further with resourcing i.e. getting funder backing, and assessing the impact of these initiatives. The thought leader behind this MOOC, Henry Mintzberg, is well known for getting initiatives like Doctors Without Borders etc. off the ground with the following approach, which forms the structure of the MOOC:

  • Working as a high-functioning team (Co-Creating)
  • Learning your way to a prototype (Designing)
  • Growing your social impact (Scaling)
  • Finding resources to help sustain your efforts (Resourcing)
  • Discerning when and how to measure your impact (Assessing)

The MOOC runs from Sept 16th to Dec 16th, with group-based discussions on a fortnightly basis, and you have up until October 14th to register. The expectation is that groups will connect via different types of social media platforms, e.g. Facebook, Twitter, Skype, LinkedIn etc., and then bring the knowledge back to the MOOC platform to work through the stages of the course and share the different social initiatives across the different groups concerned with different social issues.

You can read more about Henry Mintzberg’s Rebalancing Society vision through this free pamphlet ebook, Rebalancing Society: Radical Renewal Beyond Left, Right, and Center.



An article in the news today, Nine reasons only a tool would buy the Apple watch, reminded me of something I’d come across last month. I’d almost fallen out of bed upon opening up my email that morning and seeing a forwarded post: originally from the British Association of Applied Linguistics (BAAL) discussion list, it had made its way to the BALEAP discussion list. It was a marketing blurb for an English for Academic Purposes (EAP) publication compilation by Routledge at the staggering price of – wait for it – $1465 / £840. Annotated bibliographies can be useful but this four-volume compilation with an editor’s introduction to each volume seems to be worth more than its salt. But no comments of surprise from anyone on the BALEAP members-only d-list…life as normal it seems in the global north of well-resourced EAP teaching, learning and research…carry on regardless.

Good as gold, but stupid as mud
He’ll carry on regardless
They’ll bleed his heart ’til there’s no more blood
But carry on regardless

The Beautiful South – Good as Gold (Stupid as Mud)

But wait, before you hit that Recommend to Librarian tab on the Routledge website, doesn’t your library already subscribe to most of the journals on that solid gold EAP compilation list of titles that make up the four volumes? Let’s take a closer look at the journals that make up the bulk of this EAP compilation by Routledge. They are well-known tier 1 publications in the world of ELT: TESOL Quarterly (Wiley), Modern Language Journal (Wiley), Studies in Higher Education (Taylor & Francis – Routledge), Journal of Second Language Writing (Elsevier), English for Specific Purposes (Elsevier, formerly the ESP Journal), Journal of English for Academic Purposes (Elsevier).

All of these commercial publishers have signed onto Green Open Access Self-Archiving policy – check the links above for information on how authors can self-archive peer-reviewed post publication versions of their articles with the already granted permissions of these journals. If an author has funding of a couple of thousand dollars she can opt for Gold Open Access with many of these same journals, whereby her article will be openly available via the journal in its final .pdf-formatted version. Very few tier 1 ELT journals offer what we refer to as Diamond Open Access where all articles are openly available at the time of publication without a fee. Language Learning & Technology is an excellent example of Diamond Open Access in ELT.

Let’s get back to Green Open Access as this is the main focus of this post. In these times of austerity, even if you are fortunate enough to belong to a university with a healthy library budget, chances are your librarian won’t see the point in providing further monetary kickback to the titles on the Routledge EAP list for journals your library already subscribes to. If your library can’t afford to subscribe to these journals, or you simply don’t have access to a library like many people the world over working in ELT, you’ll want to know that the policies around Green Open Access for authors to voluntarily self-archive their publications have been put in place for people exactly in your situation – to increase access to research in your field. A growing number of Green Open Access publications can be found on authors’ personal webpages, in institutional open scholarship repositories, in disciplinary archives, and via research sharing sites such as ResearchGate.

There may in some cases be an embargo period of 1 to 2 years before authors can self-archive a postpub Open Access version of their articles but all of the journal titles on this Routledge list are well out of any embargo periods. Even the majority of books from commercial publishers in this compilation allow Open Access self-archiving of at least one chapter. And, sure, authors need to link to the publisher’s homepage and the DOI of the article or book, acknowledging the final published source.

The important point here is for authors to be savvy about copyright, and to negotiate in advance, if necessary, with publishers to ensure the right to publish articles on their personal websites, employee-affiliated sites or to distribute them individually via emails to interested readers.

How comfortable are the shoulders of ELT giants?

So, the onus here is on the author knowing her rights and her responsibilities. A win-win situation it would seem, where journal impact is positively correlated to an increase in number of citations – a genuine possibility with Open Access. Truth be told though, some ELT researchers do self-archive their papers and chapters following the affordances of Green Open Access policy but others just never seem to get round to it. Your guess is as good as mine as to why this is…

Isaac Newton via Wikiquote

Spikes of Villainy via

If I have seen further it is by standing on the shoulders of Giants – Isaac Newton.

Because new ideas must be situated in relation to assimilated disciplinary knowledge, the most influential new ideas are often those that most closely follow the old ones – Ken Hyland.

One Sri Lankan scholar tells the story of having to choose between writing his article submission by hand or on an ancient typewriter with a threadbare ribbon. He had paper only because he had bribed someone for it. EuroAmerican editors are rarely aware of the deep challenges facing scholars from countries outside of Europe and North America – Wendy Laura Belcher citing Suresh Canagarajah.

By opening access to our publications we also create the conditions of widening participation for our colleagues who are teaching and trying to do research in under-resourced contexts, primarily in the global south. The ELT world depends on this voluntary act of self-archiving from ELT researchers who publish in commercial journals to grow our field in ways that properly represent all the people involved with ELT. And, even though there are a growing number of free Open Access journal options for ELT researchers to publish with, which is great to see, we could still see a lot more in the way of self-archiving from the subscription-based ELT heavyweight titles that still tend to dominate and attract the big names in our field.

Thanks to David Wiley and his recent keynote address, Thoughts on Open at the SUNY COTE summit, for giving me this idea about the levels of discomfort experienced in trying to stand on the shoulders of giants in one’s field, whether we are talking about outputs from research or resources for learning and teaching.

I want my love, my joy, my laugh, my smile, my needs
Not in the star signs
Or the palm that she reads
I want my sun-drenched, wind-swept Ingrid Bergman kiss [replace kiss with Open Access – it kinda rhymes!]
Not in the next life
I want it in this
I want it in this

The Beautiful South – Good as Gold (Stupid as Mud)



The doge meme teaches us so much about language learning and how challenging it can be to accurately combine words and patterns when using another language. The FLAX language system teaches us so much about how we can avoid using dodgy language by employing powerful open-source language analysis tools and authentic language resources.

The FLAX (Flexible Language Acquisition) project has won the LinkedUp Vici Competition for tools and demos that use open or linked data for educational purposes. This post is the one I wrote to accompany our project submission to the LinkedUp challenge.

FLAX is an open-source software system designed to automate the production and delivery of interactive digital language collections. Exercise material comes from digital libraries (language corpora, web data, open access publications, open educational resources) for a virtually endless supply of authentic language learning in context. With simple interface designs, FLAX has been designed so that non-expert users — language teachers, language learners, subject specialists, instructional design and e-learning support teams — can build their own language collections.

The FLAX software can be freely downloaded to build language collections with any text-based content and supporting audio-visual material, for both online and classroom use. FLAX uses the Greenstone suite of open-source multilingual software for building and distributing digital library collections, which can be published on the Internet or on CD-ROM. Issued under the terms of the GNU General Public License, Greenstone is produced by the New Zealand Digital Library Project at the University of Waikato, and developed and distributed in cooperation with UNESCO and the Human Info NGO.


At FLAX we understand that content and data vary in terms of licensing restrictions, depending on the publishing strategies adopted by institutions for the usage of their content and data. FLAX has, therefore, been designed to offer a flexible open-source suite of linguistic support options for enhancing such content and data across both open and closed platforms.

Featuring the Latest in Artificial Intelligence & Natural Language Processing Software Designs

Within the FLAX bag of tricks, we have the open-source Wikipedia Miner Toolkit, which links in related words, topics and definitions from Wikipedia and Wiktionary as can be seen below in the Learning Collocations collection  (click on the image to expand and visit the toolkit in action).

Wikipedia Mining Tool in FLAX Learning Collocations Collection – click on the image to expand and visit the collection

Featuring Open Data

Available on the FLAX website are completed collections and on-going collections development with registered users. Current research and development with the FLAX Law Collections is based entirely on open resources selected by language teachers and legal English researchers as shown in the table below. These collections demonstrate how users can build collections in FLAX according to their interests and needs.

Law Collections in FLAX

Type of Resource | Number and Source of Collection Resources
Open Access Law research articles | 40 Articles (DOAJ – Directory of Open Access Journals, with Creative Commons licenses for the development of derivatives)
MOOC lecture transcripts and videos (streamed via YouTube and Vimeo) | 4 MOOC Collections: English Common Law (University of London with Coursera), Age of Globalization (Texas at Austin with edX), Copyright Law (Harvard with edX), Environmental Politics and Law (OpenYale)
Podcast audio files and transcripts (OpenSpires) | 15 Lectures (Oxford Law Faculty, Centre for Socio-Legal Studies and Department of Continuing Education)
PhD Law thesis writing | 50-70 EThoS Theses (sections: abstracts, introductions, conclusions) at the British Library (Open Access but not licensed as Creative Commons – permission for reuse granted by participating Higher Education Institutions)
British Law Reports Corpus (BLaRC) | 8.8 million-word corpus derived from free legal sources at the British and Irish Legal Information Institute (BAILII) aggregation website
FLAX Wikipedia English | Linking in a reformatted version of Wikipedia (English version), providing key terms and concepts as a powerful gloss resource for the Law Collections.
FLAX Learning Collocations | Linking in lexico-grammatical phrases from the British National Corpus (BNC) of 100 million words, the British Academic Written English corpus (BAWE) of 2500 pieces of assessed university student writing from across the disciplines, and the re-formatted Wikipedia corpus in English.
FLAX Web Phrases | Linking in a reformatted Google n-gram corpus (English version) containing 380 million five-word sequences drawn from a vocabulary of 145,000 words.
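
For readers unfamiliar with the term, a five-word sequence (a 5-gram) is simply a run of five consecutive words in a text. The short Python sketch below illustrates the idea on a toy sentence. It is not how FLAX or Google build their n-gram data; the function names `tokenize` and `count_ngrams` and the sample text are assumptions made purely for illustration.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def count_ngrams(text, n=5):
    """Count every run of n consecutive tokens in the text."""
    tokens = tokenize(text)
    grams = (tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return Counter(grams)


# Toy legal-English sample (invented for this sketch, not taken from a FLAX collection).
sample = ("the court held that the contract was void "
          "and the court held that the claim failed")

counts = count_ngrams(sample, n=5)

# Show the most frequent five-word sequences first.
for gram, freq in counts.most_common(3):
    print(freq, " ".join(gram))
```

Sequences that recur, such as "the court held that the" in this toy sample, are the kind of reusable phrase a learner might want to look up; a real collection would apply the same counting idea to a far larger body of text.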

FLAX Training Videos

Featuring Game-based Activities

Click on the image below to explore the different activities that can be applied to language collections in FLAX.

FLAX Apps for Android

We also have a suite of free game-based FLAX apps for Android devices. Now you can interact with the types of activities listed above while you’re learning on the move. Click on the FLAX app icon to the right to access and download the apps and enjoy!


FLAX Research & Development

To date, we have distributed the English Common Law and the Age of Globalization MOOC collections in FLAX to thousands of registered learners in over 100 countries – wow!

A collaborative investigation is underway with FLAX and the Open Educational Resources Research Hub (OERRH), whereby a cluster of revised OER research hypotheses is currently being employed to evaluate the impact of developing and using open language collections in FLAX with informal MOOC learners as well as formal English language and translation students.

Educating Rita by Willy Russell, 1983

Current activity within open education can be characterised as having reached a beta phase of maturity. In much the same way that software progresses through a release life cycle, beta is the penultimate testing phase, after the initial alpha-testing phase, whereby the software is adopted beyond its original developer community.

Open education has now come to the attention of the mainstream press and traditional higher education, with the uptake of Open Educational Resources (OER) and with the advent of Massive Open Online Courses (MOOCs). The participating masses can be likened to beta testers of these newly opened ways of educating. And, as with many recent software hits from Internet giants such as Google (e.g. Gmail), it is highly likely that open education will remain in a state of ‘perpetual beta’ development and testing, as we investigate and measure the impact of openness on education.

Always in Beta by Tom Fishburne

Funded by the William and Flora Hewlett Foundation, the OER Research Hub (OERRH) is currently spear-heading the testing of OER hypotheses and is aggregating research findings through their OER Impact Map. The beta testing metaphor is also relevant to my research with the FLAX language project for the open development and testing of the FLAX Open Source Software (OSS). I have been promoting the FLAX OSS language system across different educational contexts (Fitzgerald, 2013), and I am now investigating user experiences of the software across multiple research sites in order to involve users in language collections building and further development of the OSS. I will be posting findings from this research on the TOETOE project blog throughout this year.

According to publisher and open source advocate, Tim O’Reilly:

Users must be treated as co-developers, in a reflection of open source development practices (even if the software in question is unlikely to be released under an open source license.) The open source dictum, ‘release early and release often’, in fact has morphed into an even more radical position, ‘the perpetual beta’, in which the product is developed in the open, with new features slipstreamed in on a monthly, weekly, or even daily basis. It’s no accident that services such as Gmail, Google Maps, Flickr, and the like may be expected to bear a ‘Beta’ logo for years at a time. (O’Reilly, 2005)

Open Fellowship with the OER Research Hub at the UK Open University

My first introduction to the UK Open University, henceforth referred to here as the OU, was when my Dad took me to see the film Educating Rita in 1985. It took two years to reach our picture house in provincial-town New Zealand, and I was just at that age – twelve going on thirteen – to appreciate this Pygmalion story of a woman breaking through the class barriers with an emancipatory distance education from the OU. My Dad also took me canvassing with him for the NZ Labour Party in those formative years, showing me first-hand that life for those in state-housing areas was very different from life in homes belonging to those who had been to university.

I never imagined that I’d be at the OU but I am now on my second fellowship here, this time as an Open Fellow with the OERRH based at the Institute of Educational Technology, and previously from 2011-2012 as a SCORE Fellow with the Support Centre for Open Resources in Education. When Rita’s character was a student at the OU in the early 1980s, open meant that admissions barriers had been removed from entry to formal study. This is still true today with the OU’s 200,000 registered paying students coming from a variety of traditional and non-traditional backgrounds. Nonetheless, this is still ground-breaking when we consider that most of the brick ‘n’ mortar higher education institutions of the world, including those with online learning offerings, still maintain strict admissions policies based on entrance examinations and prerequisites. Open has come to mean much more than this, however, with the rapid ascension of OERs and MOOCs. And, the OU have been no strangers to this rise in informal education as demonstrated in their longstanding work with the BBC through their Open Media Unit, and in leading a bevy of wide-reaching open education projects, including OpenLearn and now FutureLearn.

Open Education Awash with Venture Capital

Open has come of age it seems, with pathways to courses, the sharing of courseware code and access to research becoming increasingly free and open to learners; and with models for educational delivery and accreditation being experimented with on an almost daily basis by educators and institutions. Getting an education is one thing but coming up with sustainable and workable solutions for the world’s problems is increasingly understood as something outside of our reach and beyond the actual remit of education. While we discuss how to come up with the best business models for selling MOOCs and higher education to the masses, it might behoove us to ask how we can occupy education to evolve sustainable communities (human and non-human) on this planet rather than continue to commodify learning, teaching and research as products for an increasingly globalised world.

Weller’s position paper on the battle for open (2013) echoes concerns from open education advocates on the distortion of key principles for openness in education (see Wiley, 2013), principles that are being sold downstream through the imposed economic value system of a booming online education market (Education Sector Factbook, 2012). The open-washing of the open education movement, in favour of capitalising on ‘open’ education at a massive scale, is being viewed in much the same way as green activists view the green-washing of the green movement, with our world’s most pressing environmental problems playing second fiddle to the big business of so-called green solutions:

When they start offering solutions is the exact moment when they stop telling the truth, inconvenient or otherwise. Google “global warming solutions.” The first paid sponsor urges “No doom and gloom!! When was the last time depression got you really motivated? We’re here to inspire realistic action steps and stories of success.” By “realistic” they don’t mean solutions that actually match the scale of the problem. They mean the usual consumer choices – cloth shopping bags, travel mugs, and misguided dietary advice – which will do exactly nothing to disrupt the troika of industrialization, capitalism, and patriarchy that is skinning the planet alive. But since these actions also won’t disrupt anyone’s life, they’re declared both realistic and a success. (Jensen, Keith & McBay, 2011)

Technology activists abound in support of the ‘information wants to be free’ slogan from the 1960s. “Information wants to be free. Information also wants to be expensive. …That tension will not go away” (Brand, 1987). Activism that is focused on the tension surrounding the freedom of information continues to grow, but what of activism that is directed at the tension between education wanting to be open and education wanting to be exclusive? Education wanting to be for life and education wanting to be for jobs only? When will we witness the scaling of massive buildings like the Shard in London by education activists – let’s call one of them Rita – in protest of formal education’s direct relationship with the limitations of commercialization? When will we raise the red flag on the global business of buying and selling education as an endgame in itself?

Subtitles errors: One climate reached the top by Gwydion M. Williams via Flickr

The purpose of education is going untested in real terms and the open education movement has only just begun educating in beta, as it were, by drawing on a pedagogy of abundance rather than a perceived pedagogy of scarcity (Weller, 2011). This shift in awareness and practice echoes Stewart Brand’s comments to Steve Wozniak, at the first Hackers’ Conference in 1984, on how information wants to be free due to the cost of getting digitised information out becoming lower and lower. The economics of learning materials (Thomas, 2014), following a recent discussion on the oer-discuss list about the progression from reusable learning objects to open educational resources, marks another useful distinction using Marxist terminology, between learning materials that have exchange versus use value:

In the discussions about whether content has value, there is often a question about whether content can be bought and sold, whether it is “monetisable”. In marxist economics that is the type of value called exchange value: where a commodity can be exchanged for money. There is another type of value: use value.  That is the extent to which a commodity is useful. It is about its utility, not its cost or price. I think most teaching resources can have a high use value both for primary use and secondary reuse, without that ever translating into an exchange value. They might be valuable but you can’t sell them. (Thomas, 2014)

It may be that Rita will draw on learning content and interactions from a variety of accessible places, including open publications and MOOCs, where ‘open’ equals free access only (for example, All Rights Reserved Coursera courses) rather than where open equals free plus legal rights to reuse, revise, remix and redistribute. It may also be that Rita will only begin to realise the use value of these educational resources – perhaps through joining Greenpeace or the Deep Green Resistance, for example – by synthesising her contributions with those of her peers for the development of a learning community that is informal, networked and open. And, most importantly, where her developing awareness will actively challenge the perpetuation and escalation of global problems that are on a truly massive scale.

In critiquing open education, Audrey Watters, in her keynote address at the Open Education 2013 conference, also proposes communities rather than technology markets as the saviors of education:

Where in the stories we’re telling about the future of education are we seeing salvation? Why would we locate that in technology and not in humans, for example? Why would we locate that in markets and not in communities? What happens when we embrace a narrative about the end-times — about education crisis and education apocalypse? Who’s poised to take advantage of this crisis narrative? Why would we believe a gospel according to artificial intelligence, or according to Harvard Business School [Christensen’s Disruptive Innovation theory, 2013], or according to Techcrunch…? (Watters, 2013)



Cut out the middle man via frontbad sketchbook

“…the attempt to cut out the middleman as far as possible and to give the learner direct access to the data” (Johns, 1991, p.30)

Importance is placed on empirical data when taking a corpus-informed and data-driven approach to language learning and teaching. Moving away from subjective conclusions about language based on an individual’s internalized cognitive perception of language and the influence of generic language education resources, empirical data enable language teachers and learners to reach objective conclusions about specific language usage based on corpus analyses. Tim Johns coined the term Data-Driven Learning (DDL) in 1991 with reference to the use of corpus data and the application of corpus-based practices in language learning and teaching (Johns, 1991). The practice of DDL in language education was appropriated from computer science where language is treated as empirical data and where “every student is Sherlock Holmes”, investigating the uses of language to assist with their acquisition of the target language (Johns, 2002:108).

A review of the literature indicates that the practice of using corpora in language teaching and learning pre-dates the term DDL with work carried out by Peter Roe at Aston University in 1969 (McEnery & Wilson, 1997, p.12). Johns is also credited with having come up with the term English for Academic Purposes (Hyland, 2006). Johns’ oft quoted words about cutting out the middleman tell us more about his DDL vision for language learning, in which teacher intuitions about language were put aside in favor of powerful text analysis tools that would provide learners with direct access to some of the most extensive language corpora available, the same corpora that lexicographers draw on for making dictionaries, to discover for themselves how the target language is used across a variety of authentic communication contexts. As with many brilliant visions for impactful educational change, however, his also appears to have come before its time.

This post will argue that the original middleman in Johns’ DDL metaphor took on new forms beyond that of teachers getting in the way of learners having direct access to language as data. An argument will be put forward to claim that the applied corpus linguistics research and development community introduced new and additional barriers to the widespread adoption of DDL in mainstream language education. Albeit well intentioned and no doubt defined by restrictions in research and development practices along the way, new middlemen were paradoxically perpetuated by the proponents of DDL making theirs an exclusive rather than a popular sport with language learners and teachers (Tribble, 2012). And, with each new wave of research and development in applied corpus linguistics new and puzzling restrictions confronted the language teaching and learning community.

The middle man comic – first issue cover via Wikipedia

The middleman in DDL has presented himself as a sophisticated corpus authority in the form of research and development outputs, including text analysis software designed by, and for, the expert corpus user with complex options for search refinement that befuddled the non-expert corpus user, namely language teachers and learners. Replication of these same research methods to obtain the same or similar results for uses in language teaching and learning has often been restricted to securing access to the exact same software and know-how for manipulating and querying linguistic data successfully.

Which language are you speaking?

He has been known to speak in programming languages, with his interfaces often requiring specialist trainers to communicate his most simple functions. Even his most widely known KWIC (Key Word In Context) interface for linguistic data presentation, with strings of search terms embedded in truncated language context snippets, remains foreign-looking to the mostly uninitiated in language teaching and learning. In many cases, he has not come cheap either, and requirements for costly subscriptions to and upgrades of his proprietary software have been the norm, especially in the earlier days.

In particular, with reference to English Language Teaching (ELT), he has criticized many widely used ELT course book publications and their language offerings for ignoring his research findings based on evidence for how the English language is actually used across different contexts of use. In response, a few ELT course book publishers have clamored around him to help him get his words out for a price but in so doing have rendered his corpus analyses invisible, in turn creating even more of a dependency on course books rather than stimulating autonomy among language teachers and learners in the use of corpora and text analysis tools for DDL. And, because publishers were primarily confining him to the course book and sometimes CD-ROM format there were only so many language examples from the target corpora that could possibly fit between the covers of a book and only the most frequent language items made it onto the compact disc.

The Oxford Collocations Dictionary for Students of English (2nd Edition from 2009 by Oxford University Press), based on the British National Corpus (BNC), is one example where high frequency collocations for very basic words like any and new predominate and where licensing restrictions permit only one computer installation per CD-ROM. Further restrictions compound the openness issue with the use of closed corpora in leading corpus-derived ELT books such as the Cambridge University Press (CUP) publication, From Corpus to Classroom (O’Keeffe, McCarthy & Carter, 2007), which might have been more aptly entitled, From Corpus to Book, as it draws heavily on the closed Cambridge and Nottingham Discourse Corpus of English (CANCODE) from Cambridge University Press and Nottingham University and recommends the use of proprietary concordancing programs, WordSmith Tools and MonoConc Pro, thereby rendering any replication of analyses for the said corpus inaccessible to its readers.

Mainstream language teacher training bodies continue to sidestep the DDL middleman in the development of their core training curricula (for example, the Cambridge ESOL exams) due to the problems he poses with accessibility in terms of cost and complexity. Instead, English language teacher training remains steadily focused on how to select and exploit corpus-derived dictionaries with reference to training learners in how to identify, for example: definitions, derivatives, parts of speech, frequency, collocations and sample sentences. In the same way that corpus-derived course books do not render corpus analyses transparent to their users, training in dictionary use does not bring teachers and their learners any closer to the corpora they are derived from.

Cambridge English Corpus

Michael McCarthy presented ‘Corpora and the advanced level: problems and prospects’ at IATEFL Liverpool 2013. One of the key take-away messages from his talk was the fact that learners of more advanced English receive little in the way of return on investment once the highest frequency items of English vocabulary have been acquired (he referred to the top 2000 words from the first wordlist of the British National Corpus that make up about 80% of standard English use). For each subsequent wordlist of 2000 words, the percentage of frequency in usage drops considerably, so the time and money you might end up spending if you sign up to yet more English language classes may not be affordable or feasible. This has particular implications in learning English for Specific Purposes (ESP), including English for Academic Purposes (EAP), which many would argue is always concerned with developing specific academic English language knowledge and usage within specific academic discourse communities.
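To make the diminishing-returns point concrete, here is a minimal Python sketch of how cumulative text coverage could be worked out for successive frequency bands of a corpus. The function name `coverage_by_band`, the simple regular-expression tokeniser and the idea of passing the corpus in as one string are all illustrative assumptions; this is not how the BNC wordlists or the figures McCarthy referred to were produced.

```python
import re
from collections import Counter


def coverage_by_band(text, band_size):
    """Return cumulative coverage (0 to 1) after each frequency band.

    Band 1 holds the `band_size` most frequent word types, band 2 the next
    `band_size`, and so on. The tokeniser is a deliberately crude assumption.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    # Frequencies only, ranked from most to least frequent word type.
    ranked = [freq for _, freq in counts.most_common()]
    coverage = []
    covered = 0
    for start in range(0, len(ranked), band_size):
        covered += sum(ranked[start:start + band_size])
        coverage.append(covered / total)
    return coverage
```

Run over a large corpus with `band_size=2000`, the first value would be expected to be by far the largest gain, with each later band adding less, which is the pattern the talk described for learners moving beyond the first wordlist.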

When I caught Michael McCarthy on the way out of the presentation theatre, he kindly agreed to walk and talk while rushing to catch his train out of Liverpool. Would the Cambridge English Corpus be made available anytime soon for non-commercial educational research and materials development purposes, I asked? I hastened to add the possibilities and the real world need for promoting corpus-based resources and practices in open and distance online education as well as in traditional classroom-based language education. He agreed that the technology had become a lot better for finally realising DDL within mainstream language teaching and learning and within materials development. Taking concordance line printouts into ELT classrooms had never really taken off in his estimation and I would have to agree with him on that point. He indicated that it would be unlikely for the corpus to become openly available anytime in the foreseeable future, however, due to the large amount of private investment in the development of the corpus with restricted access for those participating stakeholders on the project only.

But what would the real risk be in opening up this corpus to further educational research and development for non-commercial purposes with derivative resources made freely available online? Wouldn’t this be giving the corpus resource added sustainability with new lives and further opportunities for exploitation that could advance our shared understanding of how English works across different contexts, using current and high quality examples of language in context? More importantly, wouldn’t this give more software developers the chance to build more interfaces using the latest technology, and give more ELT materials developers, including language teachers, the chance to show different derivative resource possibilities for effectively using the corpus in language teaching and learning?

A non-commercial educational purpose only stipulation could be used in all of the above resource development scenarios. Indeed, these could all be linked back to the Cambridge English Corpus project website as evidence of the wider social and educational impact as a result of their initial investment. This is what will be happening with most of the publicly funded research projects in the UK following recommendations from the Finch report which come into effect in April 2014. It follows that Open Educational Resources (OER) and Open Educational teaching Practices (OEP) will allow for expertise to be readily available when Open Access research publishing is compulsory for all RCUK and EPSRC funding grants for the development of research-driven open teaching and learning derivatives. Privately funded research projects like this one from CUP could also be leading in this area of open access.

Corpora such as the British National Corpus (BNC), the British Academic Written English (BAWE) corpus, Wikipedia and Google linguistic data as a corpus are some of the many valuable resources that have all been developed into language learning and teaching resources that are openly available on the web. In the following sections, I will refer to leading applied corpus linguistics research and development outputs from leading researchers who have been making their wares freely available if not openly re-purposeable to other developers, as in the example of the FLAX language project’s Open Source Software (OSS). And, hopefully these corpus-based resources are getting easier to access for the non-expert corpus user.

“For the time being” CUP are providing free access to the English Vocabulary Profile website of resources based on the Cambridge English Corpus (formerly known as the Cambridge International Corpus), “the British National Corpus and the Cambridge Learner Corpus, together with other sources, including the Cambridge ESOL vocabulary lists and classroom materials.” Below is a training video resource from CUP available on YouTube, which highlights some of the uses for these freely available resources in language learning, teaching and materials development. This is a very useful step for CUP to be taking with making corpus-based resources and practices more accessible to the mainstream ELT community.

Open practices in applied corpus linguistics

Enter those applied corpus linguistics researchers and developers who have made some if not all of their text analysis tools and Part-Of-Speech-tagged corpora freely accessible via the Web to anyone who is interested in exploring how to use them in their research, teaching or independent language learning. Well-known web-based projects include Tom Cobb’s resource-rich Lextutor site, Mark Davies’ BYU-BNC (Brigham Young University – British National Corpus) concordancer interface and the Corpus of Contemporary American English (COCA) with WordandPhrase (with WordandPhrase training video resources on YouTube) for general English and English for Academic Purposes (EAP), Laurence Anthony’s AntConc concordancing freeware for Do-It-Yourself (DIY) corpus building (with AntConc training video resources on YouTube), and the Sketch Engine by Lexical Computing which offers some open resources for DDL. Open invitations from the Lextutor and AntConc project developers seeking input on the design, development and evaluation of existing and proposed project tools and resources are made by way of social networking sites, the Lextutor Facebook group and the AntConc Google groups discussion list. Responses usually come from a steady number of DDL ‘geeks’, however, namely those who have reached a level of competence and confidence with discussing the tools and resources therein. And, most of those actively participating in these social networking sites are also engaging in corpus-based research.

Data-Driven Learning for the masses?

My own presentation at IATEFL Liverpool was based on my most recent project with the University of Oxford IT Services for providing and promoting OSS interfaces from the FLAX language project for increasing access to the BNC and BAWE corpora, both managed by Oxford. In addition to this, the same OSS developed by FLAX has been simplified with the development of easy-to-use interfaces for enabling language teachers to build their own open language collections for the web. Such collections using OER from Oxford lecture podcasts, which have been licensed as creative commons content, have also been demonstrated by the TOETOE International project (Fitzgerald, 2013).

The following two videos from the FLAX language collections show their OSS for using corpus-based resources in ELT that are accessible both in terms of simplicity and in terms of openness. The first training video demonstrates the Web as corpus and how this resource has been effectively mined and linked to the BNC for enhancement of both corpora for uses in DDL. The second training video demonstrates how to build your own Do-It-Yourself corpora using the FLAX OSS and Oxford OER. With open corpus-based resources the reality of DIY corpora is becoming increasingly possible in DDL research and teaching and learning practice (Charles, 2012; Fitzgerald, in press).

So, go ahead, and cut out the middleman in data-driven learning.

FLAX Web Collections (derived from Google linguistic data):

The Web Phrases and Web Collocations collections in FLAX are based on another extensive corpus of English derived from Google linguistic data. In particular, the Web Phrases collection allows you to identify problematic phrasing in writing by fine-tuning words that precede and follow phrases that you would like to use in your writing by drawing on this large database of English from Google. This allows you to substitute any awkward phrasing with naturally occurring phrases from the collection to improve the structure and the fluency of writing.


FLAX Do-It-Yourself Podcast Corpora – Part One:

Learn how to build powerful open language collections through this training video demonstration. Featuring audio and video podcast corpora using the FLAX Language tools and open educational resources (OER) from the OpenSpires project at the University of Oxford and TED Talks.



Association for Distance Education in Brazil

This is the eighth and final post in a blog series based on the TOETOE International project with the University of Oxford, the UK Higher Education Academy (HEA) and the Joint Information Systems Committee (JISC). I have also made this post in the Open Educational Practices (OEP) series available as a .pdf on Slideshare.

São Paulo is what is known as an alpha world city, an important node within the global economy. From all accounts it is also the hub of Open Educational Resources (OER) in Brazil. In February 2013, I gave a workshop presentation organized by the Brazilian Association of Distance Education (ABED), which was simultaneously translated from English into Portuguese.

Brazilian Association of Distance Education (ABED)

ABED is a not-for-profit learned society that promotes the dissemination of flexible, open and distance education; founded in 1995, it currently has around 3,000 members, both individual and institutional. On their website, there is a designated ‘referatory’ where you will find a listing of some 30 repositories of OER in the Portuguese language, serving a wide range of educational levels, from K-12 to continuing education. “Yet, for a country as large as Brazil (population almost 200 million) and the language group Brazil belongs to (250 million), we are terribly far behind in the area of OER” – Fredric Litto, Chairman of ABED.

“ABED fulfils its mission by contributing as a national forum for discussion and presentation of studies and research related to Brazil. Obtaining, organising and disseminating quantitative information and presenting qualitative data analyses, in reference to the direction of education and distance learning, comprises the technical interests of ABED in providing a compass that indicates where we are in the practice of this teaching modality, allowing a glimpse of some of its trends for the future. Furthermore, by making available the quantitative data gathered, other researchers and people interested in distance learning have the opportunity to provide their own analyses and inferences.” (ABED, 2012).

In a meeting with Renato Bulcao and Bruna Medeiros at the ABED headquarters, we went over the founding principles of their work for promoting and advancing open and distance education in Brazil, along with a discussion on the potential development of OER in English and Portuguese with the TOETOE and FLAX projects:

Alannah: And, so ABED is a government-funded initiative?

Renato: No, it’s a private academic association. One of the few in Brazil because we don’t have this kind of association all over the place.

Bruna: Right. It’s like you know, we have profit but we’re not a commercial body, so you know, there’s no money around. We get some money from our affiliated associate members but it doesn’t come to us. We try to help. Distance education in Brazil is like, how can I say it? [Talks in Portuguese to Renato] Yeah, like old fashioned. So, we’re trying to progress everything.

Alannah: So, you’re an umbrella organization trying to communicate everything related to open and distance education? Because when I looked for you, I found you with…

Bruna: The OCW, right?

Alannah: Right, the OCW. On their website, it said you were the hub of OER in Brazil and I was so glad when you wrote back.

Renato: It’s true, we are the hub in Brazil, at least for the next five years.

Alannah: You must be very busy.

Bruna: Yeah, we usually have conferences three times a year. But this year we’re going to have two with one on the virtual learner in June. It’s really nice because we’ve had policy related ones before.

Renato: Tell us please about today.

Bruna: OK, about the workshop, I set up everything. We invited all the teachers, professionals, students who would be interested in learning about OER. I didn’t direct this only at English teachers, so it’s just like, you know, broadly appealing for everyone. I even opened it up for Italian institutions.

Alannah: Oh, good. The software is flexible but it’s just that we’ve built collections in English. There’s no reason why we can’t build resource collections in other languages as well. If anyone wants to build open language collections in Portuguese that would be wonderful. It’s just that English collections are the ones that we have prepared with the Oxford OER but the software is multilingual so it would be great if we could get some Brazilian OER specialists building Portuguese collections and not just collections in English.

Bruna: Oh, that’s nice. We’ll have simultaneous translation today from English to Portuguese and Portuguese to English, so you know it’ll be fine.

Social Services for Industry (SESI – Serviço Social da Indústria)

Mara Ewbank, a representative from the Brazilian Social Services for Industry (SESI – Serviço Social da Indústria in Portuguese) was in attendance at my workshop and we have stayed in contact with plans for building English and possibly Portuguese collections based on their middle and high school curricula with the FLAX OSS for developing OER collections that would serve around 18,000 students in 10 different municipalities across the São Paulo region. SESI is a private not-for-profit institution that operates throughout Brazil’s 26 states and the Federal District (Distrito Federal); it was initially set up in July 1946 by president Eurico Gaspar Dutra with the aim of “promoting social welfare, cultural development and improving the lives of workers and their families and the communities they live in.” This was in response to the introduction of new labour laws that had been established by Getúlio Vargas, who preceded Dutra and created the Consolidation of Labor Laws (CLT – Consolidação das Leis do Trabalho in Portuguese) (Wikipedia, 2013).

Recursos Educacionais Abertos (REA)

The Recursos Educacionais Abertos (REA, which translates to Open Educational Resources), one of the most active OER bodies in Brazil, was also in attendance at my presentation and they have blogged about the event on their website. To give an indication of just how important Brazil’s richest state is to OER, during my stay in Brazil it was announced that governor Geraldo Alckmin of São Paulo had vetoed in its entirety the proposed public policy OER bill (PL 989/2011) that had been passed by all committees of the São Paulo Legislative Chamber back in December 2012. The reason given for vetoing the bill was a perceived conflict of interest between the Executive and Legislative branches of government. This has been viewed as an extreme blow to OER efforts in São Paulo for the realisation of OER for democratising education in Brazil. A decree to overturn the decision is being sought by the Brazilian OER community, headed by the REA:

“We are conscious that we have lost a battle, but we are sure we have not lost the war. We will succeed in developing a more innovative and inclusionary education system, inspired by the developments of the information society. We have mobilized folks around Brazil, meetings are happening, and for now the press is on our side. In practical terms, our next steps are to partner and pressure with the Governor to enact the Bill in the form of a Decree.” (Rossini, Gonsales and Sebriam, 2013).


OpenSpires OER project at the University of Oxford

This is the seventh post in a blog series based on the TOETOE International project with the University of Oxford, the UK Higher Education Academy (HEA) and the Joint Information Systems Committee (JISC). I have also made this post in the Open Educational Practices (OEP) series available as a .pdf on Slideshare.

Do-It-Yourself Corpora

Standard industry tools in corpus linguistics for doing translation, summarisation, extraction of information, and the formatting of data for analysis in linguistic software programs were generally what was needed before one could get started with building a corpus. It is safe to say that language teachers and many researchers who do not have a background in computer science will never have the time or the interest in these processes. This is why simple interface designs like those in the FLAX language project that have been designed for the non-expert corpus user, namely language teachers and learners, are enabling teaching practitioners to be part of the language collections building process.

Stable open source software (OSS) has been designed to enable non-corpus specialists to build their own language collections consisting of text and audio-visual content that benefit from powerful text analysis tools and resources in FLAX. These collections can be hosted directly on the FLAX website under the registered users section or the OSS can be hosted on the users’ preferred website or content management system. A Moodle version of the FLAX tools has also been developed, and new tools and interactive games are currently in the beta development stage for stable release later in 2013.

This post from the TOETOE International project includes links to two training videos for building do-it-yourself (DIY) podcast corpora as can be seen below.  These demonstrate new OSS tools and interfaces from FLAX for developing interactive open language collections, based on creative commons resources from the Oxford OpenSpires project and a TED Talk given by Oxford academic, Ian Goldin. These training videos and others in the FLAX series from this project will be promoted via Russell Stannard’s Teacher Training Videos (TTV) site to reach wider international audiences including those who do not have access to YouTube. Further plans for the re-use of resource outputs from this project include the translation of the FLAX training videos into Chinese, Vietnamese, and Portuguese. And, later in 2013, the FLAX project will be releasing further OSS for enabling teachers to build more interaction into the development of DIY open language collections.

FLAX Do-It-Yourself (DIY) Podcast Corpora with Oxford OER part one

Learn how to build powerful open language collections through this training video demonstration. Featuring audio and video podcast corpora using the FLAX Language tools and open educational resources (OER) from the OpenSpires project at the University of Oxford and TED Talks.

FLAX Do-It-Yourself (DIY) Podcast Corpora with Oxford OER part two

Continue to learn how to make powerful open language collections and how to build interactivity into those collections with a wide variety of automated interactive language learning tasks through this demonstration training video. Featuring audio and video podcast corpora, using the FLAX Language tools and open educational resources (OER) from the OpenSpires project at the University of Oxford and TED Talks.

It is anticipated that these open tools and resources will provide simple and replicable pathways for other higher education institutions to develop language support collections around their own OER podcasts for wider uptake and accessibility with international audiences. The training videos demonstrate how a variety of activities have also been built into the FLAX OSS for enabling teachers to manipulate texts within the collections to create language-learning interaction with the open podcast content. The following slideshow from the 2013 eLearning Symposium with the Centre for Languages, Linguistics, and Area Studies (LLAS) at the University of Southampton shows the interactivity that can be built into the DIY corpora with FLAX. It also highlights how corpus-based resources and Data-Driven Learning did not feature at the recent BALEAP Professional Issues Meeting on Blending EAP with Technology at Southampton in the A-Z of Technology in EAP that was later compiled by the event organisers. This points to a lack of awareness around corpus-based resources in EAP where there have been no studies conducted on the user interface designs of most concordancing software for usability in mainstream language education as well as highlighting the lack of comprehensive research on technology in EAP.

TED (Ideas worth Spreading) encourages the re-use of their creative commons content for non-commercial educational purposes and many stakeholders have engaged in the re-use of TED Talks and YouTube with the TED-Ed programme. However, adding value to an open resource can also result in the decision by ELT materials developers to create a paywall around the support resource as can be seen below in the English Attack language learning software interface for TED Talks, free movie trailers etc. Perhaps this says something about the industry of ELT which views OER as yet more resources to make money from – high quality accessible resources no less that have been expressly released for sharing and the promotion of understanding…
The English Attack pay-for version of re-use with TED Talk Creative Commons content



BAWE case study from the Life Sciences collection in FLAX showing links to Wikipedia resources

This is the sixth post in a blog series based on the TOETOE International project with the University of Oxford, the UK Higher Education Academy (HEA) and the Joint Information Systems Committee (JISC). I have also made this post in the OEP series available as a .pdf on Slideshare.

FLAX British Academic Written English (BAWE) collections

The BAWE collections in FLAX, as demonstrated in the training video below, enable you to interact with the BAWE corpus of university student writing from across the disciplines to learn about the thirteen different genres assigned by the makers of the corpus (Nesi, Gardner, Thompson & Wickens, 2007). For free access to the complete manual on the making of the BAWE (Heuboeck, Holmes and Nesi, 2010), follow this link (The BAWE Corpus Manual, An Investigation of Genres of Assessed Writing in British Higher Education). Features from the FLAX open source software (OSS) project for understanding the BAWE include: word lists and keyness indicators; collocations; lexical bundles; a glossary function with Wikipedia; along with a variety of automated functions for searching, saving and linking within the BAWE corpus.
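For readers unfamiliar with the term, lexical bundles are recurrent word sequences, and the basic idea can be sketched in a few lines of Python. This is an illustration only and not the FLAX implementation: the function name `lexical_bundles`, the crude tokeniser and the `min_freq` cut-off are assumptions, and real bundle studies usually also require a sequence to occur across a minimum number of different texts, which is left out here.

```python
import re
from collections import Counter


def lexical_bundles(texts, n=4, min_freq=2):
    """Count word n-grams across `texts`; keep those seen at least `min_freq` times."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        # Slide a window of n words across each text separately,
        # so that no bundle spans two different texts.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return [(bundle, freq) for bundle, freq in counts.most_common() if freq >= min_freq]
```

Given two short student texts that both open with "On the other hand", a call with `n=4` would return that sequence with a count of 2 and discard the one-off sequences around it.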

From its earliest inception the FLAX project has been envisioned and advanced with the language teacher and learner in mind. Since 2008, I have been engaged with the FLAX project to provide user feedback on the development of the language reference collections and to devise ways to promote the project resources within mainstream English language teaching and learning communities. A simplified and intuitive interface has been developed for presenting language collections and interactive learning activities based on the powerful and complex handling of search queries from a range of linked corpora and open linguistic content.

Another open web-based interface for accessing the BAWE is located within the commercial Sketch Engine project. This project provides the more traditional KWIC (KeyWord In Context) concordancer interface for linguistic data presentation with strings of search terms embedded in truncated language context snippets. The Using Sketch Engine with BAWE manual (Nesi & Thompson, 2011) provides an in-depth user guide for the more expert corpus user.

Sketch Engine open concordancer interface for the BAWE showing results for a KWIC query for the item ‘research’.
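For readers new to concordancing, the KWIC idea can be illustrated with a few lines of Python. This is only a minimal sketch, assuming simple whitespace tokenisation and a made-up sample text; the `kwic` function and its `width` parameter are hypothetical names for illustration and do not reflect how Sketch Engine or FLAX are implemented.

```python
def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit of keyword."""
    tokens = text.split()  # naive whitespace tokenisation (an assumption)
    hits = []
    for i, tok in enumerate(tokens):
        # Ignore surrounding punctuation and letter case when matching.
        if tok.strip(".,;:!?()\"'").lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# Made-up sample text, not taken from the BAWE.
sample = ("This research builds on earlier work. "
          "The research methods were chosen to suit the data.")

for left, kw, right in kwic(sample, "research"):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>25} | {kw} | {right}")
```

Each hit is shown with a truncated window of words on either side, which is what gives the concordance its familiar aligned-column look.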

The Word Tree corpus interface is a JISC Rapid Innovation project based at Coventry University providing yet another open web-based interface alternative to KWIC searches for analysing the BAWE. One of the project’s goals is for the open source code developed for this rapid innovation project, which is available from GitHub, to be re-used in further open corpus-based projects for analysing additional corpora. This project can be followed via the Word Tree project blog and JISC final report, which outline issues encountered with managing and processing the presentation of large amounts of linguistic data through a word tree interface that provides click-through pathways and the ability to prune and graft word tree searches.

The Word Tree corpus interface for the BAWE showing a search query word tree for the items ‘research’ and ‘research methods’
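The data structure behind a word tree can be sketched as a nested dictionary in which each branch is a word that follows the search item. The sketch below is an illustration under stated assumptions only: the `word_tree` function, its `depth` parameter and the sample sentences are all hypothetical, and this is not the code released by the Word Tree project.

```python
def word_tree(sentences, root, depth=2):
    """Build a nested dict of the words that follow root, up to depth words deep."""
    tree = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok == root:
                node = tree
                # Walk or create one branch per following word.
                for nxt in tokens[i + 1:i + 1 + depth]:
                    node = node.setdefault(nxt, {})
    return tree


# Made-up sentences, not taken from the BAWE.
sentences = [
    "research methods were used",
    "research methods vary",
    "research shows growth",
]

tree = word_tree(sentences, "research")
print(tree)
```

Shared continuations such as 'methods' are merged into a single branch, which is what lets a reader click through from 'research' to 'research methods' and onwards.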

Reference corpora versus specialist corpora

Comparisons made between language as it is used in reference corpora, such as the British National Corpus (BNC) which provides a snapshot of how English occurs across a variety of contexts, and how it is used in specialist academic sub-corpora, or in actual student-generated academic text corpora as in the case of the BAWE, help us to identify which words and phrases occur more commonly in specific as well as in general academic contexts of use. Not confined by the boundaries of a printed volume, the openly available web-based BAWE collections in FLAX (demonstrated in the video above) are arguably more powerful than the average dictionary or coursebook for practice with academic English.
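The comparison described above can be sketched in Python by contrasting relative word frequencies in two token lists. This is a deliberately simplified stand-in for the keyness statistics used in corpus tools, assuming toy data and a crude floor value for words that do not occur in the reference list; the function names, the `min_ratio` threshold and both sample texts are hypothetical.

```python
from collections import Counter


def relative_freq(tokens):
    """Map each word to its share of all tokens in the list."""
    counts = Counter(tokens)
    total = len(tokens)
    return {word: count / total for word, count in counts.items()}


def overused(specialist, reference, min_ratio=2.0):
    """Words at least min_ratio times more frequent in specialist than in reference."""
    spec = relative_freq(specialist)
    ref = relative_freq(reference)
    # Assumed floor frequency for words unseen in the reference list,
    # so that the ratio is always defined.
    floor = 1 / (len(reference) + 1)
    ratios = {}
    for word, freq in spec.items():
        ratio = freq / ref.get(word, floor)
        if ratio >= min_ratio:
            ratios[word] = ratio
    return sorted(ratios, key=ratios.get, reverse=True)


# Toy token lists standing in for a specialist and a reference corpus.
specialist = "the data suggest that the hypothesis is supported by the data".split()
reference = "the cat sat on the mat and the dog sat by the door".split()

print(overused(specialist, reference))
```

With real corpora the same idea is usually expressed with a significance measure rather than a raw ratio, but the principle of comparing a specialist collection against a general reference collection is the same.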

Before commencing my journeys with the TOETOE International project, I had written an extensive project blog post on open trends within corpora and ELT materials development in Radio Ga Ga: corpus-based resources, you’ve yet to have your finest hour. At the Open Education conference in Vancouver in October 2012, in my presentation on the Great Beyond with Open ELT Resources (see below), I had outlined the development work that TOETOE and the FLAX team were going to embark on with respect to the BAWE corpus, and the evaluations of the earlier BAWE collections in FLAX that we would be seeking from international participants in collaboration with the project. Feedback from international stakeholders in China (Confucian dynamism in Chinese ELT context) and Korea (the English language skyline in South Korea) on the BAWE collections in FLAX led to further design and development iterations while back in New Zealand with the FLAX team (Love is a stranger in an open car to tempt you in and drive you far away…toward open educational practice). These have been captured in the project blog posts given here in brackets.

Earlier in 2012 FLAX had developed the wikify function for matching key words and phrases in the BAWE collections to Wikipedia entries as a glossary support feature. This provides help with subject specific language in the BAWE which may be daunting to learners and teachers alike who are not yet familiar with the specific language of a given topic area but where there is an expectation that learners will need to develop proficiencies with specific academic English if they are to engage in English-medium higher education programmes. For example, the technical language from a biology methodology recount text in the BAWE can be glossed for enhanced understanding in FLAX with links to Wikipedia definitions and related topics.
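As a rough illustration of glossing by linking, the sketch below matches phrases in a text against a small term list and builds Wikipedia links for the matches. It is a sketch under stated assumptions, not a description of how the FLAX wikify function works: the `GLOSSARY` dictionary, the `wikify` helper and the sample sentence are all hypothetical, and a real system would need to disambiguate terms rather than rely on a hand-made list.

```python
import re

# Hypothetical hand-made glossary mapping terms to Wikipedia article titles.
GLOSSARY = {
    "polymerase chain reaction": "Polymerase chain reaction",
    "centrifuge": "Centrifuge",
}


def wikify(text, glossary):
    """Return (term, Wikipedia URL) pairs for glossary terms found in text."""
    links = []
    # Check longer terms first so multi-word phrases are listed before single words.
    for term in sorted(glossary, key=len, reverse=True):
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            url = "https://en.wikipedia.org/wiki/" + glossary[term].replace(" ", "_")
            links.append((term, url))
    return links


# Made-up sentence in the style of a methodology recount, not taken from the BAWE.
sentence = "Samples were spun in a centrifuge before the polymerase chain reaction was run."

for term, url in wikify(sentence, GLOSSARY):
    print(term, "->", url)
```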

Corpus-based approaches for understanding genre in EAP

“Unsurprisingly, the utility of the corpus is increased when it has been annotated, making it no longer a body of text where linguistic information is implicitly present, but one which may be considered a repository of linguistic information.” (ICT4ELT McEnery & Wilson, 2012)

Corpus studies help with investigations into understanding more than just discrete language items. The study of genres as different communities of practice develop them is also central to corpus work for better understanding the different written assessment types that students will actually encounter across the academy. Generic EAP writing assessments, especially those found in College Composition and Writing Across the Curriculum programmes (Freedman; Petraglia, 1995; Russell, 2002), have been criticized for becoming genres unto themselves; with serious doubts cast on their ability to resemble or assist with transfer in the multitude of specific genres that students will be expected to engage with in their different academic programmes. Generic EAP teaching resources and writing assignments that teach general things about academic language and writing have resulted in EAP writing that Wardle describes as conforming to ‘mutt genres’ (2009).

In response to the issue of genre in university writing, the BAWE corpus collections in FLAX provide EAP teachers and students with a first-hand look into this student-generated corpus of assessed undergraduate and taught postgraduate writing collected at three UK universities: Warwick, Oxford Brookes and Reading. Thirteen different genres were assigned by the developers of the BAWE (Nesi et al., 2004-2007), as can be seen below (hyperlinks to the Life Sciences sub-corpus of the BAWE collections in FLAX):

The Oxford Text Archive, where the BAWE is managed by University of Oxford IT Services, granted the FLAX project access to develop OSS for language learning and teaching on top of this valuable research corpus, in the same way that FLAX has developed OSS to enable access to the BNC, which is also managed and distributed by University of Oxford IT Services. Four sub-corpora have been developed in FLAX, corresponding to written academic assessments across the major academic disciplines as identified by the makers of the BAWE: the Physical Sciences, the Life Sciences, the Social Sciences and the Arts and Humanities BAWE collections in FLAX. It was determined that student texts from the BAWE would serve as an achievable model for academic writing for EAP students, and that this corpus of student texts would serve as a starting point if linked to wider resources, namely the BNC, Wikipedia, the Learning Collocations collection in FLAX and the live Web, thereby providing a ‘bridge’ to more expert writing.

The developers of the BAWE corpus have a follow-on ESRC-funded project, Writing for a Purpose, which provides learning resources based on the BAWE for enhancing understanding of genre for writing across the disciplines. These resources are going to be promoted at the upcoming 2013 IATEFL and BALEAP conferences and will definitely be something to look out for.



English, the Dalit Goddess by Shant Kala Niketan

“The West has today opened its door. There are treasures for us to take. We will take and we will also give, From the open shores of India’s immense humanity.”

(Extract from the poem Gitanjali or Offerings by Rabindranath Tagore, 1910)


This is the fifth post in a blog series based on the TOETOE International project with the University of Oxford, the UK Higher Education Academy (HEA) and the Joint Information Systems Committee (JISC). I have also made this post in the OEP series available as a .pdf on Slideshare:

While driving back from a half-day tour of Delhi my taxi driver struck up a conversation about work and family. He had improved his basic level of English through his work and liked to practise with tourists. I was meeting with Professor L.P of the Rajasthan Ministry of Education that Sunday evening and told the taxi driver that I worked in education to promote free and open resources for teaching and learning English. “Good, great, …we don’t have a computer but we use our phones”. This made me think of the English in Action project in Bangladesh with UK Aid and the UK Open University for delivering English language learning resources via mobile phones. He went on to tell me that family life was very difficult for him, only visiting his wife and teenage son once a month in their home village while he worked long days as a driver in Delhi trying to earn enough for the family and so that his son could attend private English lessons, all the time stressing that he was a sixth-level person. Would there be any help from the government with his son’s English classes, I asked. “No, Ma’am… no, Ma’am”.

E-learning emancipatory English

I had met Dr. L.P. Mahawar at the EuroCALL Conference on OER in Bologna in 2012. Just before I arrived in India he had posted in the EuroCALL forum about an upcoming conference to be held at Jaipur National University, E-learning Emancipatory English: Fast Forwarding the Future, in collaboration with SAADA (Society for Analysis, Dialogue, Application and Action), of which he is also a member. The conference would cover topics such as: English as a symbol of status and a tool for emancipation; different Englishes evolving in the contemporary world; different pedagogical approaches to English Language Teaching; the role of the mother tongue in ESL/EFL; and English for Specific and Academic Purposes. Naturally, I wanted to be part of this, although my dates for India and the conference didn’t quite work out. So, I emailed him and said I’d like to contribute a presentation by distance and he replied positively, suggesting that we also meet while I was in Delhi. I was keen to find out more about OER and emancipatory English in the Indian context.

In my interview with Dr. L.P. Mahawar, he pointed to other overriding social issues currently preventing the Dalit and other low socio-economic groups from succeeding in education and beyond, identifying: high truancy among teachers and students; high drop-out rates among students; skewed educational goals in favour of cram examinations; and a lack of e-connectivity at schools and in homes. Many of the problems identified in my interview with Dr. Mahawar are reflected in the newly formed TESS-India project with the UK Open University.

He also referred back to Project High Tech: Teaching English Communicatively, which he had presented at the EuroCALL conference on OER in Language Teaching. This was an ELT teacher training project carried out over two years across 200 schools in Rajasthan as a joint collaboration between the US Embassy and the Department of Higher Education with the Government of Rajasthan. Sustainability is a key issue with any project that intends to create and manage new teaching practices at the levels of policy making, curriculum planning and teaching. The scale of this project was large and Dr. L.P. Mahawar was concerned about the lack of incentives for teachers to stay motivated by the project goals beyond the funded period:

“The sustainability and viability of such open resources can be effected only when the teachers, planners and policy makers develop a sense of social responsibility, and only when the teachers, educators or other practitioners get kudos for their efforts in terms of rewards, awards, medals and trophies or whatever may be the form of reinforcement and encouragement. The teachers should be given credits on point system like the Academic Performance Indicator; their voluntary work should be linked to promotional and incremental opportunities; their efforts to create authentic open resources ranked equivalent to good research work and their work load of teaching hours reduced in proportion to the quantity of open resources they propose to create or have created.” (Mahawar, 2012)

Saraswati, the Hindu goddess of education and the creative arts. Image via flickr

Saraswati [Sah-rah-swah-tee] is the Hindu Goddess of education and the creative arts and is often depicted holding a stringed instrument with a book at her feet. Indian mothers are known to pray to her for their children’s success in school. The Dalit or ‘untouchables’ of India have had a somewhat turbulent history with Hinduism, however, and have fought hard not to be banned from worshipping in Hindu temples due to their low caste (Misra, 2007). Another struggle for the Dalit centres on access to English language education as many Dalit view English as the tool for emancipation, leading to better paying jobs and a stake in the current Indian technology boom to escape the cycle of poverty. In the context of post-colonial India, some have even intimated that prayers to Saraswati for help with learning English might result in falling out of favour with this Goddess (Gopalkrishna, 2012).

English-medium education in India is still primarily the domain of the higher castes. One of India’s most well-known 20th century freedom movement advocates and pro-English-language campaigners, Bhimrao Ramji Ambedkar, was a leading figure in drafting India’s new constitution. He was also a Dalit or ‘untouchable’. So, the Dalit have decided to build a temple to a new Deity, the Goddess of English. As can be seen at the top of this post, she is depicted for her work in helping the Dalit with their 21st century English language communication aspirations, standing on a computer pedestal and holding a pen up high in one hand and the Indian constitution in the other. In an article in the Guardian Weekly online newspaper in 2011, India’s outcasts put faith in English, Amarchand Jauhar, an English teacher who was supervising the temple’s construction in Banka village in northern Uttar Pradesh, was quoted as saying, “Without English, nothing is possible for us Dalits” (Rahman, 2011).

Naturally, English language education is a politically loaded subject in India as it is in most parts of the world. Indeed, both the ELT industry and the open education movement have been accused of spreading linguistic imperialism (Phillipson, 1992; Pennycook 1995 & 1998). Added to this, the prevalence and dominance of the ELT industry internationally along with the promotion of English-medium OER from well-funded initiatives make it difficult for those working in under-resourced contexts to compete for the uptake of non-English OER on an international scale.

Nationalist interest in not promulgating what many have seen as the enslaving tool of the British Raj is one argument against English-medium education. For pro Kannada-medium educationalists and activists in the state of Karnataka, where the local government was proposing English education for the Dalit and other low caste peoples, the preservation and promotion of local languages in state-run education is another argument. The government proposal has since been scrapped, leading Dalit activists and scholars to question whether there is a hidden political agenda to isolate Dalit and other low caste peoples from accessing English (Gopalkrishna, 2012).

To provide further perspective on English in India: in a 2005 lecture at Oxford University, India’s still current prime minister, Manmohan Singh, upon receiving an honorary degree from his alma mater, reflected upon the great legacy British education and the English language had left for India in the current age of globalisation:

“It used to be said that the sun never sets on the British Empire. I am afraid we were partly responsible for sending that adage out of fashion! But, if there is one phenomenon on which the sun cannot set, it is the world of the English-speaking people, in which the people of Indian origin are the single largest component. Of all the legacies of the Raj, none is more important than the English language and the modern school system…In indigenising English, as so many people have done in so many nations across the world, we have made the language our own. Our choice of prepositions may not always be the Queen’s English; we might occasionally split the infinitive; and we may drop an article here and add an extra one there. I am sure everyone will agree, nevertheless, that English has been enriched by Indian creativity and we have given you back R.K. Narayan and Salman Rushdie. Today, English in India is seen as just another Indian language.”  (Singh, 2005)

Indeed, the continuation of English’s position as the international lingua franca in research, higher education and business is wholly dependent on it being owned by non-native English speakers (Graddol, 2006). With the escalating pressure to be able to function in English in order to get ahead in life, can a balance be struck by making high-quality and flexible English language resources open to those individuals and communities that would otherwise be unable to afford English-medium and English language education? After all, if English is to remain the international lingua franca, then surely it stands to reason that we view English simply for what it is: one of many linguistic communication tools for accessing and building knowledge on a global scale, and one that should be accessible to all in the same way that access to the Internet should be a given for all.

Delhi University OER par excellence

Through the OER University Google Groups network I came into contact with Professor Vinod Kumar Kanvaria, faculty and educational technologist of the Department of Education at the University of Delhi. Fifty students from two different programs, Educational Technology and Pedagogy of English, had taken active roles in preparing the day’s events at what was formerly known as the Central Institute of Education (CIE). India’s first Education Minister, Maulana Azad, with then Prime Minister Pandit Jawaharlal Nehru, had helped to establish CIE in 1947, envisioning an institution to do more than just “turn out teachers who would be ‘model teachers’, but to evolve into a research centre for solving new educational problems for the country” (see CIE website).


Professor Vinod Kumar, Alannah Fitzgerald and Sirawon Chahongnao at Delhi University Central Institute of Education, OER International Programme January 2013

From having engaged with Professor Kanvaria’s students for a full day and having observed the high levels of awareness around OER and OEP, I quickly came to the conclusion that these future educationalists are passionate about making a difference in Indian education through technology and openness. They are cognizant of the fact that eLearning is not yet a reality in most Indian schools and are taking their own mobile electronic devices equipped with portable speakers into classrooms where they are doing their section training. They realize the potential for eLearning is immense and more importantly, it is what the students are motivated by and would like to see more of in school.

Over a delicious traditional Indian lunch prepared by Delhi University staff, Professor Kanvaria showed me a range of high-quality paper-based OER course packs that he and his colleagues had put together for training teacher educators with OER (Kanvaria 2013a; Kanvaria 2013b). The students who dined with us said the open educational resources used in their courses were very well received and that they would be keen to transfer this open educational practice to their own development of teaching and training resources in future workplaces. Needless to say, it was most impressive to see a new generation of educationalists and learning technologists being taught by OER specialists.

In feedback on the presentation and workshop, students said they realized the deeper importance of sharing to develop not only themselves as open educational practitioners but their respective fields also. One student made the observation that a lot of the ELT lesson plan sharing sites that were once free are now asking for some form of payment and that it was difficult to find truly open educational resources in ELT. She was happy to have discovered Russell Stannard’s Teacher Training Videos (TTV) through the workshop as a useful starting point for web-based language resources. Professor Kanvaria made a very good point about the blurred line between open and free resources in relation to uploading OER to proprietary platforms such as iTunesU and closed university websites. His point was that opportunities for user feedback are being missed when institutions such as Oxford do not create open interactive spaces and platforms, even on their university website, that encourage the re-uploading of re-mixed and re-purposed OER to show what people are doing with their OER. However, individual Oxford academics have received plenty of positive feedback on their OpenSpires podcasts from audiences, including the following:

“I have recently enrolled in the [……] University with the plan to complete a BA in Philosophy, but the first unit I have had to complete is a Study Skills unit which has been so boring and mundane I have been questioning whether to continue or not. Your enthusiasm for philosophy is infectious and put me back on course to continue my studies. Thanks again.”

“Can I just say how utterly engrossing they are – and how completely stimulating. I completed my undergraduate studies a great number of years ago, but listening to you lecture makes me yearn for study.” (Highton, Fresen and Wild, 2011 p.35)


