Radio Ga Ga by Queen

This is the fourth satellite post from the mothership post, Radio Ga Ga: corpus-based resources, you’ve yet to have your finest hour. I have also made the complete hyperlinked post (in five sections) available as a .pdf on Slideshare.

Radio 4

A lot of talk around defining current and trending practices in EAP can be tuned into via open as well as proprietary channels. In this section, I will refer to new-found open practices in EAP which are embracing Web 2.0 technologies against a backdrop of closed practices in EAP academic publishing and within subscription-only EAP memberships. I will open up discussion around these different practices within EAP to sketch out common ground for where EAP could be heading with respect to global outreach.

Toward open practices in EAP

Recent months have seen a steady opening up of practices for sharing expertise and resources in EAP. The new EAP teaching blog based at Nottingham University, a discussion-based offshoot of their new Masters programme in EAP teaching, makes use of the most widely used open-source blogging software, WordPress. Thanks to our friends in Canada, EAP tweetchat sessions are run on Twitter with the hashtag #EAPchat every first and third Monday of the month, bringing together EAP practitioners who wish to participate in global EAP discussions as well as suggest topics for upcoming tweetchat sessions. An archived transcript page is available at the end of each #EAPchat Twitter session.

Free webinars from Oxford University Press (OUP), the largest academic publishing house in the world, are also broadcasting talk on EAP to the world. Julie Moore, who has collaborated on the new Oxford EAP book series, has also contributed free webinars with OUP which have been attended by EAP practitioners from around the world. A review of one of Julie’s webinars on academic grammar can be found on the OUP-sponsored ELT global blog. Wouldn’t it be great if more EAP practitioners opened up their practice in this way, suggesting areas of expertise in EAP that they would like to contribute and broadcast via webinars with OUP’s considerable market outreach?

The EAP community in the UK mainly gathers around BALEAP with their Professional Issues Meetings (PIMs), accreditation scheme, biennial conference and lively email discussion list. There is a noticeable push-pull between open and closed EAP practices within BALEAP which I would like to bring into the open for discussion. Openness was built into the Durham PIM on the EAP Practitioner in June of this year, making it the first BALEAP event to have a Twitter hashtag thanks to forward thinking from Steve Kirk. Since this PIM he has also been curating a useful EAP practitioner resources site.

There does seem to be a willingness on the part of BALEAP members to experiment with new technologies so that their discussions around issues in EAP are openly available. However, the BALEAP email discussion list which I mentioned above is the only one of half a dozen similarly JISC-hosted email discussion lists that I belong to which is closed off by the BALEAP membership subscription pay-wall. The others, which I subscribe to for free, are all open, and discussion transcripts from their contributing members can be searched via the web through the JISC email archives. Keeping the email discussion list closed has been a BALEAP executive committee decision, and I question whether this decision best reflects the current drive toward openness among BALEAP members who are interested in sharing their insights and expertise with those around the world for whom BALEAP membership is not an affordable option.

BALEAP recently added the strap-line ‘the global forum for EAP practitioners’ to its website. BALEAP was formerly the British Association of Lecturers in EAP (hence the continuity from the acronym to the name BALEAP). Some of their event and research outputs can be found on their website but others can only be accessed via the subscription-only Journal of English for Academic Purposes (JEAP). And you can probably guess where I’m going here with concerns around openness, or lack thereof, with respect to being the global EAP practitioner forum…

Nonetheless, an invaluable EAP resource that BALEAP have put out onto the wild web is the EAP teacher competency framework. An EAP practitioner portfolio mentoring programme is currently in the pilot stages and there is talk of matching EAP teaching competencies in BALEAP with the UK Professional Standards Framework (UKPSF) at the HEA, but once again for those non-UK and freelance EAP practitioners who do not work for UK higher education institutions that subscribe to the HEA such an alignment of frameworks may not be suitable or relevant. That said, the essence of the UKPSF is useful and perhaps with the current OER International programme at the HEA we can see ownership of the UKPSF go international? HEA accreditation as a UK body will remain a reality, however, so it will be interesting to see what the HEAL working party at BALEAP who are collaborating with the HEA will come up with in response to shaping the identity of BALEAP who aspire to be known as the global forum for EAP practitioners.

Now that a Web Resources Sub Committee (WRSC) has recently been formed with other technologically and OER oriented EAPers at BALEAP, we may yet see things open up. Below is the presentation that Ylva Berglund Prytz and I (both on the WRSC at BALEAP) gave on Openness in English for Specific Academic Purposes (ESAP) at the PIM in Sheffield in November 2011.

Elsevier are the publishers of JEAP and, in my experience, open access in academic publishing has come about through the pressure tactics of certain academic communities of practice lobbying for green and gold standard open access publications in their representative fields. Open Access Week, with its theme ‘set the default to open’, is coming up again on October 22nd.

Moving to open access research publications all depends on the culture of the academic research community. It will take those EAP practitioners and researchers working in privileged and well-resourced institutions that can easily afford institutional subscriptions to memberships like BALEAP to seriously consider open access and the potential for global reach of research into EAP. It will also take those EAP practitioners who are working off their institutional radars, so to speak, and who are experimenting with Web 2.0 technologies to get their message and expertise out there for global interaction around issues in EAP practice and research. Something I picked up from Steve Kirk’s account is a recent book setting an open trend in EAP publishing, Writing Programs Worldwide: Profiles of Academic Writing in Many Places, which is published in a free digital online format as well as a pay-for print version. This echoes what publishers are doing with big names in more open fields, such as the Bloomsbury Academic publication of The Digital Scholar by Martin Weller. Exciting times and opportunities lie ahead for EAP publishing.

English for Specific Academic Purposes with data driven learning resources

It seems to be no great coincidence that Tim Johns who coined the term Data Driven Learning (DDL) in 1994 had also come up with the term English for Academic Purposes (EAP) in 1974 (Hyland, 2006). According to Chris Tribble’s preliminary results from his latest survey in-take on DDL (announced at the TaLC closing keynote address), EAP practitioners still make up a high percentage of those who took the survey, indicating greater uptake of corpus-based resources and practices in EAP than those in EFL / ESL, for example.

Open corpus-based tools and resources have the potential to equip and enable EAP practitioners to develop relevant ESAP materials. Awareness of and training in these open corpus-based resources will need to be shared across the EAP community, however, to ensure that we are crowd-sourcing our expertise and our resources in this area.  If you click on the image below this will take you to a talk I gave at the Open University in the UK on addressing academic literacies with corpus-based OER. This was inspired by the Tribble DDL survey and the lead up to the TaLC10 conference. It was an added bonus to have one of the BAWE corpus developer team members in the audience that day and to receive positive feedback on how FLAX have opened up the BAWE in collaboration with TOETOE and the Learning Technologies Group Oxford.

OU video presentation on Addressing Academic Literacies with open corpus-based resources

Over the course of this academic year FLAX and TOETOE will continue to build on work around opening up research corpora like the BAWE and the BNC, managed by the Oxford Text Archive, for developing resources for ESAP. We will also be engaging with various stakeholder groups through f2f workshops, online surveys and interviews to evaluate open corpus-based resources, and I will be sharing insights from this work on this blog.

One final word on OER concerns where corpus-based resources might play a significant role in making higher education more accessible to the estimated 100 million learners worldwide who currently qualify to study at university level but do not have the means to do so (UNESCO, 2008). Because English is the educational lingua franca, open educationalists are going to source support resources for academic English from the approaches and materials that are currently popular and openly available to re-use under creative commons licences. This throws up interesting issues around specificity in EAP for supporting learners with discipline-specific English.

A parallel universe in EAP materials development

Cartoon image referred to by Niko Pfund, USA president of OUP in podcast on Ebooks, Reading and Scholarship in a Digital Age

It would be an understatement to say that the academic publishing world is undergoing a radical transformation with the arrival of digital and open publishing formats which are democratising publishing as we know it. Niko Pfund, President of Oxford University Press (USA), discusses the ways in which technology affects reading, scholarship, publishing and even thinking in a presentation he gave at Oxford recently which you can access by clicking on the cartoon image above.

I learned a lot from this podcast, including OUP’s commitment since 2003 to publishing all research monographs in both digital and print formats. I also learned of their admiration for what Wikipedians have done for opening up knowledge and publishing through human crowd-sourcing that utilises open technologies and platforms. A parallel drawn here to something that was brought up repeatedly at the EduWiki conference is how academic publishing houses like OUP are well placed to open up the disciplines in the same way as Wikipedia by bringing the voices of the academy into the public sphere through more accessible means of communication than research, and by effectively linking this research to current world events to gain wider relevance and readership.

Pfund refers to messy experimental times in academic publishing with lots of new business models currently being explored for spear-heading changes in publishing. OUP heavily subsidise and give away a lot of published resources including ELT textbooks to the developing world, but not yet under open licences (someone please correct me if I’m wrong here) for those practitioners working in under-resourced communities so that they can re-mix and re-distribute these same resources.

OUCS and OUP are literally down the road from one another, a parallel universe as it were. The former is research, learning and teaching focused with a strong commitment to public scholarship, and the latter is focused on exploring new practices and business models for delivering the best in academic publishing. Arguably, there is a lot of overlap that can be tapped into here for the collaborative development of open corpus-based resources and practices for the global ELT market.

In-house EAP materials development

EAP teachers have been developing in-house EAP materials in response to the generic EAP teaching resources available on the mainstream market, as a means of meeting the real needs of their students going on to any number of degree programmes. However, as I mentioned in section 2 of this blog post, many of these in-house EAP materials make use of third party copyrighted texts and therefore cannot be shared beyond the secret garden of the classroom or the institutional password-protected VLE. An enormous opportunity presents itself here to EAP practitioners and corpus linguists alike to push out resources in English for Specific Academic Purposes (ESAP) using open Data-Driven Learning (DDL) methods, texts, tools and platforms for sharing OER for ESAP. A significant cultural shift in practice will be required, however, to realise this vision for developing flexible and open ESAP resources that can be adapted for use in multiple educational contexts both off- and on-line. Once again, in subsequent blog posts, I will be presenting open educational practices and open research methods to open up discussion on ways forward with this particular global EAP vision.


Alexander, O., Bell, D., Cardew, S., King, J., Pallant, A., Scott, M., Thomas, D., & Ward Goodbody, M. (2008) Competency framework for teachers of English for Academic Purposes, BALEAP.

Hyland, K. (2006). English for Academic Purposes: An Advanced Handbook. London: Routledge.

Johns, T. (1994). From Printout to Handout: Grammar and Vocabulary Teaching in the Context of Data-driven Learning. In Odlin, T. (ed.), Perspectives on Pedagogical Grammar: 27-45. Cambridge: Cambridge University Press.

Radio Ga Ga by Queen via Deviant Art

This is the third satellite post from the mothership post, Radio Ga Ga: corpus-based resources, you’ve yet to have your finest hour. I have also made the complete hyperlinked post (in five sections) available as a .pdf on Slideshare.

Radio 3

I confess that I spend most of my time listening to BBC Radio 3. The parallel that I will draw here is that I was never formally educated in classical music in the same way as I have never worked toward formal qualifications in corpus linguistics during any of my studies. Because I am working broadly across the areas of language resources development and enhancing teaching and learning practices through technology it was only a matter of time, however, before I started exploring and toying with corpus-based resources. I met Dr. Shaoqun Wu of the FLAX project while at a conference in Villach, Austria in 2006 and by 2007 I had begun to delve into the world of open-source digital library collections development with the University of Waikato’s Greenstone software, developed and distributed in cooperation with UNESCO, for realising the much broader vision of reaching under-resourced communities around the world with these open technologies and collections.

Bridging Teaching and Language Corpora (TaLC)

Let’s fast forward to the 2012 Teaching and Language Corpora Conference in Warsaw, Poland. Although I have participated in corpus linguistics conferences before, this was my first time attending the biennial TaLC conference. TaLCers are very much researchers working in the area of corpus linguistics and DDL, and this conference was themed around bridging the gap between DDL research and uses for corpus-based resources and practices in language teaching and learning.

One of the keynote addresses from James Thomas, Let’s Marry, called for greater connectedness in pursuing relationships between those working in DDL research and those working in pedagogy and language acquisition. At one point he asked the audience to make a show of hands for those who knew of big names in the ELT world, including Scrivener, Harmer and Thornbury. Only a few raised their hands. He also made the point that these same ELT names don’t make their way into citations for research on DDL. Interestingly, I was tweeting points made in the sessions I attended to relevant EAP and ELT / EFL / ESL communities online without a TaLC conference hashtag. It would’ve been great to have the other TaLCers tweeting along with me, raising questions and noting key take-away points from the conference to engage interested parties who could not make the conference in person and to catalogue a twitterfeed for TaLC that could be searched by anyone via the Internet at a later point in time. It would’ve also been great to record keynote and presentation speakers as webcasts for later viewing. When approached about these issues later, however, the conference organisers did express interest in ways of amplifying their events by building such mechanisms for openness into their next conference.

Prising open corpus linguistics research in Data Driven Learning (DDL)

Problems with accessing corpus-based resources and successfully implementing them in language teaching and learning scenarios have been numerous. As I discussed in section 2 of this blog, many of the concordancing tools referred to in the research have been subscription-based proprietary resources (for example, the Wordsmith Tools), most of which have been designed with at least the intermediate-level concordance user in mind. These tools can easily overwhelm language teaching practitioners and their students with raw corpus data presented via complex interfaces that offer too many options for refinement. Mike Scott, the main developer of the Wordsmith Tools, has also released a free version of his concordancing suite with less functionality, and this would suffice for many language teaching and learning purposes. He attended my presentation on opening up research corpora with open-source text analysis tools and OER and was very open-minded, as were the other TaLCers whom I met at the conference, regarding new and open approaches for engaging teachers and learners with corpus-based resources.

There are many freely available annotated bibliographies compiled by corpus linguists which you can access on the web for guidance on published research into corpus linguistics. Many researchers working in this area are also putting pre-print versions of their research publications on the web for greater access and dissemination of their work; see Alex Boulton’s online presence for an example of this. As hinted at earlier in part 2 of this blog, however, much of this published research takes closed formats: articles, chapters and the few teaching resources available are often restricted to and embedded within subscription-only journals or pricey academic monographs. For example, Berglund-Prytz’s ‘Text Analysis by Computer: Using Free Online Resources to Explore Academic Writing’ (2009) is a great written resource for where to get started with OER for EAP but, ironically, the journal it is published in, Writing and Pedagogy, is not free. Lancaster University is home to the openly available BNCweb concordancing software, which you need only register for to be able to install a free standard copy on your personal computer. A valuable companion resource on BNCweb was published by Peter Lang in 2008 but, once again, this is not openly accessible to interested readers who cannot afford to buy the book. The great news is that the main TaLC10 organiser, Agnieszka Lenko, has spearheaded openness with this most recent event by trying to secure an Open Access publication for the TaLC10 proceedings papers with Versita publishers in London.

DIY corpora with AntConc in English for Specific Academic Purposes (ESAP)

At TaLC10 I discovered a lot of overlap with Maggie Charles’ work on building DIY corpora with EAP postgraduate students using the AntConc freeware by Laurence Anthony. We had also included workshops on AntConc for students in our OER for EAP cascade at Durham so it was great to see another EAP practitioner working in this way who had gathered data from her on-going work in this area for presentation and discussion at the conference. Many of her students at the University of Oxford Language Centre are working toward dissertation or thesis writing which raises interesting questions around enabling EAP students to become proficient in developing self-study resources for English for Specific Academic Purposes (ESAP). Her recent paper in the English for Specific Purposes Journal (2012) points to AntConc’s flexibility for student use due to it being freeware that can be installed on any personal computer or flash-drive key for portable use. Laurence Anthony’s website also offers a lot of great video training resources for how to use AntConc. The potential that AntConc offers for building select corpora to those students currently pursuing inter-disciplinary studies in higher education is also noted by Charles. Having said this, drawbacks with certain more obscure subject disciplines, for example Egyptology (Ibid.), that had not yet embraced digital research cultures and were still publishing research in predominantly print-based volumes or image-based .pdf files made the development of DIY corpora still beyond the reach of those few students.
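
For readers who have not yet seen what a concordancer does with a DIY corpus, here is a minimal Python sketch of a keyword-in-context (KWIC) display. It is an illustration only and not how AntConc works internally; the function name `kwic`, the `width` parameter and the two sample sentences are all invented for this example. In practice the texts would be the student's own collection of plain-text articles.

```python
import re

def kwic(texts, keyword, width=30):
    """Return keyword-in-context lines for every whole-word match of `keyword`.

    `texts` is a list of plain-text strings (a hypothetical DIY corpus).
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for text in texts:
        # Collapse whitespace so context windows are not broken by newlines.
        flat = " ".join(text.split())
        for match in pattern.finditer(flat):
            left = flat[max(0, match.start() - width):match.start()]
            right = flat[match.end():match.end() + width]
            # Right-align the left context so the keyword lines up in a column.
            lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines

# Invented sample sentences standing in for a student's own corpus.
sample_corpus = [
    "The results suggest that the method is robust. These results were replicated.",
    "Our results differ from earlier findings.",
]

for line in kwic(sample_corpus, "results"):
    print(line)
```

Lining the keyword up in a central column is what lets a learner scan down the page for recurring patterns to the left and right, such as which verbs tend to follow "results" in their discipline.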

Beyond books and podcasts through linking and crowd-sourcing

While presenting on the power of linked resources within the FLAX collections and pushing these outward to wider stakeholder communities through TOETOE, I came across another rapid innovation JISC-funded OER project at the Beyond Books conference at Oxford. The Spindle project, also based at the Learning Technologies Group Oxford, has been exploring linguistic uses for Oxford’s OpenSpires podcasts with work based on open-source automatic transcription tools. Automatic transcription is often accompanied by a high rate of inaccuracy. Spindle has been looking at ways of developing crowd-sourcing web interfaces that would enable English language learners to listen to the podcasts and correct the automatic transcription errors as part of a language learning crowd-sourcing task.

Automatic keyword generation was also carried out in the Spindle project on the OpenSpires podcasts, yielding far more accurate results. These keyword lists, which can be assigned as metadata tags in digital repositories and channels like iTunesU, offer further resource enhancement for making the podcasts more discoverable. Automatically generated keyword lists such as these can also be used for pedagogical purposes, for example in the pre-teaching of vocabulary. The TED500 corpus by Guy Aston, which I also came across at TaLC10, is based on the TED talks (ideas worth spreading), which have also been released under creative commons licences and transcribed through crowd-sourcing.
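
To give a feel for how a keyword list can be generated automatically, here is a small Python sketch that ranks the words in a transcript by how much more frequent they are than in a reference text, using the log-likelihood keyness statistic that is common in corpus linguistics. This is an assumption on my part about one reasonable way to do it, not a description of the Spindle project's actual method; the function names, the simple tokeniser and both sample texts are invented for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def keywords(target_text, reference_text, top_n=5):
    """Rank words over-represented in the target by log-likelihood keyness."""
    target = Counter(tokenize(target_text))
    reference = Counter(tokenize(reference_text))
    c = sum(target.values())      # size of the target text in tokens
    d = sum(reference.values())   # size of the reference text in tokens
    scored = []
    for word, a in target.items():
        b = reference.get(word, 0)
        # Keep only words that are relatively more frequent in the target.
        if a / c <= b / d:
            continue
        expected_target = c * (a + b) / (c + d)
        expected_reference = d * (a + b) / (c + d)
        score = 2 * (
            a * math.log(a / expected_target)
            + (b * math.log(b / expected_reference) if b else 0.0)
        )
        scored.append((word, score))
    # Highest score first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]

# Invented stand-ins for a podcast transcript and a general reference corpus.
talk = "the corpus shows the corpus helps learners use the corpus"
reference = "the cat sat on the mat and the dog sat on the log"

for word, score in keywords(talk, reference):
    print(f"{word}\t{score:.2f}")
```

With a real transcript and a large general reference corpus such as the BNC, the top of a list like this could be reviewed by a teacher before being used as metadata tags or as a vocabulary pre-teaching list.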

The potential for open linguistic content to be reused, re-purposed and redistributed by third parties globally, provided that they are used in non-commercial ways and are attributed to their creators, offers new and exciting opportunities for corpus developers as well as educational practitioners interested in OER for language learning and teaching.


Anthony, L. (n.d.). Laurence Anthony’s Website: AntConc.

Berglund-Prytz, Y (2009). Text Analysis by Computer: Using Free Online Resources to Explore Academic Writing. Writing and Pedagogy 1(2): 279–302.

British National Corpus, version 3 (BNC XML Edition). 2007. Distributed by Oxford University Computing Services on behalf of the BNC Consortium.

Charles, M. (2012). ‘Proper vocabulary and juicy collocations’: EAP students evaluate do-it-yourself corpus-building. English for Specific Purposes, 31: 93-102.

Lexical Analysis Software & Oxford University Press (1996-2012). Wordsmith Tools.

Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. (2008). Corpus Linguistics with BNCweb – a Practical Guide. Frankfurt am Main: Peter Lang.

Radio Ga Ga album cover by Queen via Wikipedia

These past few months I’ve been tuning into a lot of different practitioner events and discussions across a range of educational communities which I feel are of relevance to English language education where uses for corpus-based resources are concerned. There’s something very distinct about the way these different communities are coming together and in the way they are sharing their ideas and outputs. In this post, I will liken their behaviour to different types of radio station broadcast, highlighting differences in communication style and the types of audience (and audience participation) they tend to attract.

I’ve also been re-setting my residential as well as my work stations. No longer at Durham University’s English Language Centre, I’m now London-based and have just set off on a whirlwind adventure for further open educational resources (OER) development and dissemination work with collaborators and stakeholders in a variety of locations around the world. TOETOE is going international and is now being hosted by Oxford University Computing Services (OUCS) in conjunction with the Higher Education Academy (HEA) and the Joint Information Systems Committee (JISC) as part of the UK government-funded OER International programme.

I will also be spreading the word about the newly formed Open Education Special Interest Group (OESIG), the Flexible Language Acquisition (FLAX) open corpus-based language resources project at the University of Waikato, and select research corpora, including the British National Corpus (BNC) and the British Academic Written English (BAWE) corpus, both managed by OUCS, which have been prised open by FLAX and TOETOE for uses in English as a Foreign Language (EFL) – also referred to as English as a Second Language (ESL) in North America – and English for Academic Purposes (EAP). Stay tuned to this blog in the coming months for more insights into open corpus-based English language resources and their uses in different teaching and learning contexts.

This post is what those in the blogging business refer to as a ‘cornerstone’ post, as it includes many insights into the past few months of my teaching fellowship in OER with the Support Centre in Open Educational Resources (SCORE) at the Open University in the UK: many posts within one, as it were. This post also provides a road map for taking my project work forward while identifying shorter blogging themes for posts that will follow this one. This particular post will also act as the mother-ship TOETOE post from which subsequent satellite posts will be linked. Please use the red menu hyperlinks in the section below to dip in and out of the four main sections of this blog post series. I have chosen this more reflective style of writing through blogging so that my growing understandings in this area are more accessible to unanticipated readers who may stumble upon this blog and hopefully make comments to help me refine my work. Two more formal case studies on my TOETOE project to date will be coming out soon via the HEA and the JISC.

I have also made this hyperlinked post (in five sections) available as a .pdf on Slideshare.

Which station(s) are you listening to?

BBC Radio has been going since 1927. With audiences in the UK, four stations in particular are firm favourites: youth oriented BBC Radio 1 featuring new and contemporary music; BBC Radio 2 with middle of the road music for the more mature audience; high culture and arts oriented BBC Radio 3; and news and current affairs oriented BBC Radio 4. Of course there are many more stations but these four are very typical of those found around the world. What is more, I’ve selected these four very distinct stations as the basis for a metaphor around the way four very distinct educational practitioner communities are intersecting with corpus-based language teaching resources. This metaphor will draw on thought waves from the following:

Radio 1 – what’s new and hip in open corpus-based resources and practices

Radio 2 – the greatest hits in ELT materials development and publishing

Radio 3 – research from teaching and language corpora

Radio 4 – the current talk in EAP: open platforms for defining practice