The doge meme teaches us so much about language learning and how challenging it can be to accurately combine words and patterns when using another language. The FLAX language system teaches us so much about how we can avoid using dodgy language by employing powerful open-source language analysis tools and authentic language resources.
The FLAX (Flexible Language Acquisition) project has won the LinkedUp Vici Competition for tools and demos that use open or linked data for educational purposes. This post is the one I wrote to accompany our project submission to the LinkedUp challenge.
FLAX is an open-source software system designed to automate the production and delivery of interactive digital language collections. Exercise material comes from digital libraries (language corpora, web data, open access publications, open educational resources) for a virtually endless supply of authentic language learning in context. With simple interface designs, FLAX has been designed so that non-expert users — language teachers, language learners, subject specialists, instructional design and e-learning support teams — can build their own language collections.
The FLAX software can be freely downloaded to build language collections with any text-based content and supporting audio-visual material, for both online and classroom use. FLAX uses the Greenstone suite of open-source multilingual software for building and distributing digital library collections, which can be published on the Internet or on CD-ROM. Issued under the terms of the GNU General Public License, Greenstone is produced by the New Zealand Digital Library Project at the University of Waikato, and developed and distributed in cooperation with UNESCO and the Human Info NGO.
REMIX WITH FLAX
At FLAX we understand that content and data vary in terms of licensing restrictions, depending on the publishing strategies adopted by institutions for the usage of their content and data. FLAX has, therefore, been designed to offer a flexible open-source suite of linguistic support options for enhancing such content and data across both open and closed platforms.
Featuring the Latest in Artificial Intelligence & Natural Language Processing Software Designs
Within the FLAX bag of tricks, we have the open-source Wikipedia Miner Toolkit, which links in related words, topics and definitions from Wikipedia and Wiktionary, as in the Learning Collocations collection.
Featuring Open Data
Available on the FLAX website are completed collections and on-going collections development with registered users. Current research and development with the FLAX Law Collections is based entirely on open resources selected by language teachers and legal English researchers as shown in the table below. These collections demonstrate how users can build collections in FLAX according to their interests and needs.
Law Collections in FLAX

| Type of Resource | Number and Source of Collection Resources |
| --- | --- |
| Open Access Law research articles | 40 Articles (DOAJ – Directory of Open Access Journals, with Creative Commons licenses for the development of derivatives) |
| MOOC lecture transcripts and videos (streamed via YouTube and Vimeo) | 15 Lectures (Oxford Law Faculty, Centre for Socio-Legal Studies and Department of Continuing Education) |
| PhD Law thesis writing | 50-70 EThOS Theses (sections: abstracts, introductions, conclusions) at the British Library (Open Access but not licensed as Creative Commons; permission for reuse granted by participating Higher Education Institutions) |
Linking in lexico-grammatical phrases from the British National Corpus (BNC) of 100 million words, the British Academic Written English corpus (BAWE) of 2500 pieces of assessed university student writing from across the disciplines, and the re-formatted Wikipedia corpus in English.
Linking in a reformatted Google n-gram corpus (English version) containing 380 million five-word sequences drawn from a vocabulary of 145,000 words.
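To make the idea of a five-word sequence concrete, here is a minimal sketch in Python of how such sequences might be pulled out of a handful of sentences and searched by keyword. This is not FLAX's own code, and it says nothing about how the Google n-gram corpus was actually reformatted. The function names (`five_grams`, `count_ngrams`, `phrases_containing`) and the two sample sentences are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter


def five_grams(text, n=5):
    """Split text into lowercase word tokens and return every run of n words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(texts, n=5):
    """Count how often each n-word sequence occurs across a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(five_grams(text, n))
    return counts


def phrases_containing(counts, word, top=3):
    """Return the most frequent sequences that contain the given word."""
    hits = [(gram, c) for gram, c in counts.items() if word in gram]
    # Most frequent first; ties are broken alphabetically for a stable order.
    hits.sort(key=lambda item: (-item[1], item[0]))
    return [(" ".join(gram), c) for gram, c in hits[:top]]


# Hypothetical sample sentences, standing in for a real corpus.
sample = [
    "The results of the study show that",
    "The results of the study suggest",
]
counts = count_ngrams(sample)
print(phrases_containing(counts, "results", top=1))
# → [('the results of the study', 2)]
```

A learner looking up "results" would see the phrase that recurs most often in context. A corpus of hundreds of millions of sequences would of course need a pre-built index rather than an in-memory counter like this one.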
FLAX Training Videos
Featuring Game-based Activities
A range of different game-based activities can be applied to language collections in FLAX.
FLAX Apps for Android
We also have a suite of free game-based FLAX apps for Android devices, so you can interact with these types of activities while you’re learning on the move.
A collaborative investigation is underway with FLAX and the Open Educational Resources Research Hub (OERRH), whereby a cluster of revised OER research hypotheses is currently being employed to evaluate the impact of developing and using open language collections in FLAX with informal MOOC learners as well as formal English language and translation students.
Current activity within open education can be characterised as having reached a beta phase of maturity. In much the same way that software progresses through a release life cycle, beta is the penultimate testing phase, after the initial alpha-testing phase, whereby the software is adopted beyond its original developer community.
Open education has now come to the attention of the mainstream press and traditional higher education, with the uptake of Open Educational Resources (OER) and with the advent of Massive Open Online Courses (MOOCs). The participating masses can be likened to beta testers of these newly opened ways of educating. And, as with many recent software hits from Internet giants such as Google (e.g. Gmail), it is highly likely that open education will remain in a state of ‘perpetual beta’ development and testing, as we investigate and measure the impact of openness on education.
Funded by the William and Flora Hewlett Foundation, the OER Research Hub (OERRH) is currently spearheading the testing of OER hypotheses and is aggregating research findings through its OER Impact Map. The beta testing metaphor is also relevant to my research with the FLAX language project for the open development and testing of the FLAX Open Source Software (OSS). I have been promoting the FLAX OSS language system across different educational contexts (Fitzgerald, 2013), and I am now investigating user experiences of the software across multiple research sites in order to involve users in language collections building and further development of the OSS. I will be posting findings from this research on the TOETOE project blog throughout this year.
According to publisher and open source advocate, Tim O’Reilly:
Users must be treated as co-developers, in a reflection of open source development practices (even if the software in question is unlikely to be released under an open source license.) The open source dictum, ‘release early and release often’, in fact has morphed into an even more radical position, ‘the perpetual beta’, in which the product is developed in the open, with new features slipstreamed in on a monthly, weekly, or even daily basis. It’s no accident that services such as Gmail, Google Maps, Flickr, del.icio.us, and the like may be expected to bear a ‘Beta’ logo for years at a time. (O’Reilly, 2005)
Open Fellowship with the OER Research Hub at the UK Open University
My first introduction to the UK Open University, henceforth referred to here as the OU, was when my Dad took me to see the film Educating Rita in 1985. It took two years to reach our picture house in provincial-town New Zealand, and I was just at that age – twelve going on thirteen – to appreciate this Pygmalion story of a woman breaking through the class barriers with an emancipatory distance education from the OU. My Dad also took me canvassing with him for the NZ Labour Party in those formative years, showing me first-hand that life for those in state-housing areas was very different from life in homes belonging to those who had been to university.
I never imagined that I’d be at the OU but I am now on my second fellowship here, this time as an Open Fellow with the OERRH based at the Institute of Educational Technology, and previously from 2011-2012 as a SCORE Fellow with the Support Centre for Open Resources in Education. When Rita’s character was a student at the OU in the early 1980s, open meant that admissions barriers had been removed from entry to formal study. This is still true today with the OU’s 200,000 registered paying students coming from a variety of traditional and non-traditional backgrounds. Nonetheless, this is still ground-breaking when we consider that most of the brick ‘n’ mortar higher education institutions of the world, including those with online learning offerings, still maintain strict admissions policies based on entrance examinations and prerequisites. Open has come to mean much more than this, however, with the rapid ascension of OERs and MOOCs. And the OU have been no strangers to this rise in informal education, as demonstrated in their longstanding work with the BBC through their Open Media Unit, and in leading a bevy of wide-reaching open education projects, including OpenLearn and now FutureLearn.
Open Education Awash with Venture Capital
Open has come of age it seems, with pathways to courses, the sharing of courseware code and access to research becoming increasingly free and open to learners; and with models for educational delivery and accreditation being experimented with on an almost daily basis by educators and institutions. Getting an education is one thing but coming up with sustainable and workable solutions for the world’s problems is increasingly understood as something outside of our reach and beyond the actual remit of education. While we discuss how to come up with the best business models for selling MOOCs and higher education to the masses, it might behoove us to ask how we can occupy education to evolve sustainable communities (human and non-human) on this planet rather than continue to commodify learning, teaching and research as products for an increasingly globalised world.
Weller’s position paper on the battle for open (2013) echoes concerns from open education advocates on the distortion of key principles for openness in education (see Wiley, 2013), principles that are being sold downstream through the imposed economic value system of a booming online education market (Education Sector Factbook, 2012). The open-washing of the open education movement, in favour of capitalising on ‘open’ education at a massive scale, is being viewed in much the same way as green activists view the green-washing of the green movement, with our world’s most pressing environmental problems playing second fiddle to the big business of so-called green solutions:
When they start offering solutions is the exact moment when they stop telling the truth, inconvenient or otherwise. Google “global warming solutions.” The first paid sponsor, www.CampaignEarth.org, urges “No doom and gloom!! When was the last time depression got you really motivated? We’re here to inspire realistic action steps and stories of success.” By “realistic” they don’t mean solutions that actually match the scale of the problem. They mean the usual consumer choices—cloth shopping bags, travel mugs, and misguided dietary advice—which will do exactly nothing to disrupt the troika of industrialization, capitalism, and patriarchy that is skinning the planet alive. But since these actions also won’t disrupt anyone’s life, they’re declared both realistic and a success. (Jensen, Keith & McBay, 2011)
Technology activists abound in support of the ‘information wants to be free’ slogan from the 1980s. “Information wants to be free. Information also wants to be expensive. …That tension will not go away” (Brand, 1987). Activism that is focused on the tension surrounding the freedom of information continues to grow, but what of activism that is directed at the tension between education wanting to be open and education wanting to be exclusive? Education wanting to be for life and education wanting to be for jobs only? When will we witness the scaling of massive buildings like the Shard in London by education activists – let’s call one of them Rita – in protest of formal education’s direct relationship with the limitations of commercialization? When will we raise the red flag on the global business of buying and selling education as an endgame in itself?
The purpose of education is going untested in real terms and the open education movement has only just begun educating in beta, as it were, by drawing on a pedagogy of abundance rather than a perceived pedagogy of scarcity (Weller, 2011). This shift in awareness and practice echoes Stewart Brand’s comments to Steve Wozniak, at the first Hackers’ Conference in 1984, on how information wants to be free due to the cost of getting digitised information out becoming lower and lower. The economics of learning materials (Thomas, 2014), following a recent discussion on the oer-discuss list about the progression from reusable learning objects to open educational resources, marks another useful distinction using Marxist terminology, between learning materials that have exchange versus use value:
In the discussions about whether content has value, there is often a question about whether content can be bought and sold, whether it is “monetisable”. In marxist economics that is the type of value called exchange value: where a commodity can be exchanged for money. There is another type of value: use value. That is the extent to which a commodity is useful. It is about its utility, not its cost or price. I think most teaching resources can have a high use value both for primary use and secondary reuse, without that ever translating into an exchange value. They might be valuable but you can’t sell them. (Thomas, 2014)
It may be that Rita will draw on learning content and interactions from a variety of accessible places, including open publications and MOOCs, where ‘open’ equals free access only (for example, All Rights Reserved Coursera courses) rather than where open equals free plus legal rights to reuse, revise, remix and redistribute. It may also be that Rita will only begin to realise the use value of these educational resources – perhaps through joining Greenpeace or the Deep Green Resistance, for example – by synthesising her contributions with those of her peers for the development of a learning community that is informal, networked and open. And, most importantly, where her developing awareness will actively challenge the perpetuation and escalation of global problems that are on a truly massive scale.
In critiquing open education, Audrey Watters, in her keynote address at the Open Education 2013 conference, also proposes communities rather than technology markets as the saviors of education:
Where in the stories we’re telling about the future of education are we seeing salvation? Why would we locate that in technology and not in humans, for example? Why would we locate that in markets and not in communities? What happens when we embrace a narrative about the end-times — about education crisis and education apocalypse? Who’s poised to take advantage of this crisis narrative? Why would we believe a gospel according to artificial intelligence, or according to Harvard Business School [Christensen’s Disruptive Innovation theory, 2013], or according to Techcrunch…? (Watters, 2013)
Brand, S. (1987). The Media Lab: Inventing the Future at MIT. Viking Penguin, p. 202, ISBN 0-14-009701-5.
This post is about how I came to be seduced by open educational practices (OEP).
TOETOE International blog series
After a period of radio silence, I have prepared a new series of blog posts on OEP in ELT based on my TOETOE International project with the University of Oxford, the UK Higher Education Academy (HEA) and the Joint Information Systems Committee (JISC). They will be released weekly from today leading up to my presentation at the OER13 Conference in Nottingham in April, Stories from the Open Frontier of English Language Education Resources. These posts are a version of the case study I have prepared for this project, FLAX Weaving with Oxford Open Educational Resources, which will be published by the HEA/JISC as an OER later this year. I have also made this post in the OEP series available as a .pdf on Slideshare.
I have assembled these posts into ethnographic accounts (LeCompte & Schensul 1999:17; Clifford 1990:51-52) to stop the clock as it were and to reorder the recent past that has been observed and jotted down; to systematize, contextualize and assemble the activity of the TOETOE International project across seven different countries. They will be part narrative and part design dialectic, drawing on stories and evaluations made by international stakeholders concerning the re-use of Oxford content: Oxford-managed corpora (the British National Corpus aka BNC and the British Academic Written English corpus aka BAWE) and Oxford-created OER (podcast lectures and seminars, images, essays, ebooks) in combination with other open English-medium content. Moreover, these evaluation narratives will continue to inform the design of open source digital library software for developing flexible open English language learning and teaching collections with the FLAX project (Flexible Language Acquisition) at the University of Waikato in New Zealand.
Thick descriptions (Geertz, 1973) will be presented from networked meetings, workshops, conference presentations and interviews with OER and ELT practitioners for arriving at better understandings of the social acts and symbols connected with the international open education movement. As part of the reflexive writing process, I have re-storied the stories of participating individuals and institutions, placing them in chronological sequence and providing causal links among ideas. Themes arising from the stories contain new metaphors for linking unfamiliar phenomena from each country represented with familiar concepts for understanding OER in the international context. Topics introduced by this TOETOE International blog series include: emancipatory English, Do-It-Yourself (DIY) open English language collections building, working OER into traditional ELT publications, and long-range planning for embedding OER and OEP within sustainable English language education.
What drives someone toward open educational practice?
The reasons will be numerous but the one that stands out for me is the capacity to work across the international open education network, either in online or face-2-face mode. Working across disciplinary, technological and geographical boundaries, my current practice seems very distant from the practice I was trained in all those years ago when I did the Cambridge Dip.TEFLA (now the DELTA) in Seoul, Korea. Nonetheless, everything that I do now in my new open educational practice is very much informed by my past teaching practice in traditional classroom-based EFL/ESL and EAP.
There are vast changes happening across education globally and there is a growing need for flexible and high quality open educational resources in English along with an expanded open infrastructure to support research, teaching, training, learning and curriculum development while English is the lingua franca in education, research and publishing. Indeed, the position of English as international lingua franca is wholly dependent on its use and ownership by non-native speakers of English (Graddol, 2006). However, the reality of a rapidly expanding global higher education industry (UNESCO 2008), where open and online distance education are fast becoming major players because of affordances with educational technologies, has yet to trickle down into the workflow of English language teaching practitioners working in traditional classroom-based education.
In one of the learning technology forums I belong to, someone was asking after recommended PhD programmes; someone else replied that whatever area you do your PhD in, you’d better be prepared to live and breathe your chosen topic for many years to come, if not your whole career. With my PhD I have begun identifying flexible pathways for open educational resources and practices to be shared across traditional classroom-based and open online English language education, but I expect that I will be continuing with this inquiry for quite some time to come.
Somewhere OvER the Rainbow – the myth about OER quality in language resources
Not surprisingly, I was assigned to the Libraries and Languages presentation slot at OpenEd 2012 where the conference theme was Beyond Content. The presenters from the other project in this session, Developing Foreign Language Courses for the Open Library Project, seemed to be fairly new to OER and raised issues around OER quality, stating that they needed to work with professional resource developers and publishers to produce what appeared to me to be fairly ordinary audio recordings for target language items to be used in their project resources.
Put simply, publishing language resources with a reputable publishing house does not always guarantee quality, in the same way that publishing with an open license does not always guarantee quality. The difference is that if you buy a course book and it turns out to be a lemon then you’re stuck with it – you either leave it on the shelf or you spend hours developing supplementary resources to ‘fix’ it. However, if you subscribe to an open educational practice model for materials development you can:
Re-use an OER and if it doesn’t work for you then you’re free to:
Re-vise / re-purpose;
Re-mix with other open (and proprietary content which you have cleared for use) and;
Re-distribute through a variety of open and proprietary channels.
These are the four Rs of OER (Wiley, 2009). A far cry from the materials development method I learned on the Cambridge CELTA and DipTEFLA modules which was to Select, Adapt, Reject and Supplement (SARS) course book materials from leading ELT publishers (Graves, 2003).
In addition to raising the point about quality with the other presenters during the Q&A of my Open Education 2012 session, I discussed the on-going myth about OER quality over the lunch break with one of my SCORE colleagues from the UK, Chris Pegler. I have been a big fan of her Resource Reuse Card Game from the ORIOLE project (Open Resources: Influence on Learners and Educators), which looks at issues surrounding educational resource re-use, including the issue of quality. It turns out that I would be re-using her card game in workshops in Korea, New Zealand and Vietnam as part of this project, and I will be including more findings from these interactions with the re-use card game in upcoming posts.
Reinforcement of the myth surrounding OER quality was not what I was expecting to encounter at an open education conference but I do come across this a lot in the work I do with teacher training at ELT events. I have noticed a discernible pattern whereby a handful of language teachers will say that their role at their institution is to develop resources (often single-handedly) for their programme(s), and where many more teachers will openly declare that they do not consider themselves to be supported or encouraged to develop materials to share across their community of practice. Common reasons given for not developing and sharing resources beyond classroom handouts include a deficit in technology training and a reliance on in-house materials or proprietary course books that have been selected and/or developed by programme managers. These are all valid reasons considering these are all common practices.
In many ways we are trained to consume and not to create resources, and at most we permit ourselves to adapt and supplement often irrespective of intellectual property rights, making it difficult to share beyond institutional and virtual learning environment walls. But can language practitioners and the training and professional bodies that promote current ELT practice continue to shy away from an era of ubiquitous digital content and self-publishing platforms? Going through the motions with course books is a killer so how are we going to support our creative license if all that’s required of us is to consume and regurgitate ready-made ELT skills meals in the form of generic course books? Hopefully the question of bringing language teachers to the realization of their central role as materials developers will be one of the topics on the table at the Materials Development Association (MATSDA) University of Liverpool Conference which I will be attending in April directly after the IATEFL Conference Liverpool 2013.
Less yak and more hack! : rapid prototyping of resources
…it became clear to me that every technology is based upon what I call the orchestration of phenomena, natural effects working together. If you look at any new technology as a whole symphony orchestra of working phenomena, it becomes a huge wonder. I have a sense of wonder far, far greater than I had before. As human beings, we’re using these things unthinkingly every day—it’s like having magic carpets at our disposal, and we have no idea how they fly. Let me add one last thing. I’m an enthusiast about technology, but I am also suspicious of it and what it’s doing to us. It intrudes in our lives, it causes us problems such as climate change, and it’s taken away a lot of our deep connection with nature. But at the same time it’s an incredible wonder. (Interview with W. Brian Arthur, author of The Nature of Technology)
Unless you know what’s at your disposal technologically-speaking and unless you know how to bring resources together, mindful of their affordances and their limitations, then to the untrained eye technological innovation can seem like pure genius. But it’s probably more the case of working through problem solving scenarios step by step, pulling together an ever increasing swag bag of tech goodies to create solutions for the moment until the next thing comes along… and so the cycle continues. This can feel very overwhelming to the individual teacher who would like to be better at using technology, and this is why Russell Stannard’s Teacher Training Videos (TTV) is such a big hit among language teachers for bringing what’s out there from the wide world of web-based language resources to teachers.
We now have the technology to flip the course book, the classroom and even higher education with the massive explosion in MOOCs (Massive Open Online Courses) entering the traditional university world with for-profit providers such as Udacity, Coursera and FutureLearn. However, the point I would like to add to this is that resource developers such as those whose web-based language technologies are featured on TTV need feedback on what does and does not work in practice. This is where language teachers can come squarely into the technology equation and learn far more from evaluating and contributing to the development of resources than they would ever pick up at any teacher training continuing professional development session on technology. It dawned on me during my Masters in Edtech and ELT at Manchester University that I wasn’t going to learn much from talking about technology; instead I ended up going directly to the source itself by working with open source software (OSS) developers at the FLAX project.
Hackfests with OSS developers and OER book sprints (a maths book sprint is one example) with educators are two rapid prototyping methods for creating code and educational resources. There is no time for hesitancy or hierarchy: you simply work and learn with others to devise shared goals and to bring all that you can to the creation process; to come up with rapid prototypes to share back to the wider community to re-use, re-purpose, re-mix and re-distribute as OER. By attending two Mozilla Drumbeat festivals in Barcelona and London I got to observe and participate in early discussions for the rapid prototyping of Open Badges for educational assessment and Mozilla Popcorn for creating interactive online videos.
While back in New Zealand late last year with the FLAX project team at the Greenstone digital library lab at Waikato, every week I would participate in developer meetings with the computer scientists behind the project and one other English language teacher from the Chinese Open University who is also basing her PhD research on the FLAX project. Well-versed in natural language processing and research on current web-based search behaviour, the computer scientists behind the interface designs of the FLAX collections and activities were adept at exploiting available linguistic resources for the development of simple-to-use language learning collections and OSS text analysis tools. I soon picked up what the limitations of the different technologies and resources were. The focus of these meetings was to develop rapid prototype resources for envisioning and discussing how they could work across different language learning scenarios. I was able to observe and contribute to many iterations of the resources currently under development and I will be bringing these resources to the fore of future blog posts in this series.
I also had a chance to present my work at the Tertiary Writers Network Colloquium which was hosted by the Department of Education at Waikato. This was a great opportunity to share open practices in EAP with a non UK-based audience working mainly in Australasia and in the US. I highlighted some of the OEP going on with the EAP community online using social networking technologies such as Twitter, blogs, Slideshare, YouTube and so on for reflection on the different types of networks we are and are not plugging into. EFL/ESL has been employing these technologies for longer for sharing ideas and resources in general ELT but there is more that could be done with connecting teachers to resources development projects, either through the OSS community or through working with traditional ELT publishers for creating more effective resource evaluation channels that would help teachers learn more about technology.
This would involve the development of resources to engage potential end-users, namely language teachers and students, in the research and development cycle of technology for ELT. In the field of educational technology we refer to this approach as design-based research which Terry Anderson, professor and Canada research chair in distance education, has referred to as action research on steroids (2007). Anderson’s analogy is a useful one as most language teachers are familiar with action research, which shares many of the same principles as design-based research.
Pragmatism is central to both approaches, often employing mixed methods of inquiry to arrive at tangible solutions to educational problems. Normally within action research cycles it is individual teaching practitioners who carry out classroom teaching interventions to observe, record and reflect on the impact of these interventions over time with the aim of informing and improving teaching practice (Reason & Bradbury, 2007). However, within design-based research cycles, emphasis is more commonly placed on educational practitioners working in collaboration with research and design teams (Anderson & Shattuck, 2012).
Returning to EAP, the question remains as to how much we can learn about EAP by talking about it. Quite a bit, and I’m all for sharing views about what EAP is as it tries to define itself. What I would like to see beyond yak and competency frameworks like the one from BALEAP that came out in 2008, however, is more in the way of teaching and learning resources from EAP practitioners and evaluations on what works. At this point in time, we’re not yet collaborating on resources development practices across our EAP contexts in any sustainable way. It would be great if we could clone more Andy Gilletts of the Using English for Academic Purposes (UEfAP) website, successfully bringing together genre and corpus-based approaches to EAP resources development. However, it would be even better if instead of creating EAP resources that are open gratis (free to access like UEfAP) we were developing EAP resources that are open libre (free to re-use, re-vise, re-mix and re-distribute), for scaling collaborative open educational resources and practices in EAP as well as in the wider ELT community.
Anderson, T. & Shattuck, J. (2012). Design-Based Research: A Decade of Progress in Education Research. Educational Researcher, 41(1): 16-25.
Clifford, J. (1990). Notes on (field)notes. In R. Sanjek (ed.), Fieldnotes: The makings of anthropology (pp. 47–70). Ithaca, NY: Cornell University Press.
Fitzgerald, A. (In press). FLAX Weaving with Oxford Open Educational Resources. Open Educational Resources International Case Study. Commissioned by the Higher Education Academy (HEA) and the Joint Information Systems Committee (JISC), United Kingdom.
Geertz, C. (1973). The interpretation of cultures: selected essays. New York: Basic Books.
Graddol, D. (2006). English Next – why English as a global language may mean the end of ‘English as a Foreign Language’. The British Council: The English Company.
Graves, K. (2003). Coursebooks. In D. Nunan (Ed.), Practical English Language Teaching. New York: McGraw-Hill.
LeCompte, M. & Schensul, J. (1999). Analyzing and interpreting ethnographic data. California: AltaMira Press.
Reason, P. & Bradbury, H. (2007). Handbook of Action Research, 2nd Edition. London: Sage.
I confess that I spend most of my time listening to BBC Radio 3. The parallel that I will draw here is that I was never formally educated in classical music in the same way as I have never worked toward formal qualifications in corpus linguistics during any of my studies. Because I am working broadly across the areas of language resources development and enhancing teaching and learning practices through technology it was only a matter of time, however, before I started exploring and toying with corpus-based resources. I met Dr. Shaoqun Wu of the FLAX project while at a conference in Villach, Austria in 2006 and by 2007 I had begun to delve into the world of open-source digital library collections development with the University of Waikato’s Greenstone software, developed and distributed in cooperation with UNESCO, for realising the much broader vision of reaching under-resourced communities around the world with these open technologies and collections.
Bridging Teaching and Language Corpora (TaLC)
Let’s fast forward to the 2012 Teaching and Language Corpora Conference in Warsaw, Poland. Although I have participated in corpus linguistics conferences before, this was my first time attending the biennial TaLC conference. TaLCers are very much researchers working in the area of corpus linguistics and DDL, and this conference was themed around bridging the gap between DDL research and uses for corpus-based resources and practices in language teaching and learning.
One of the keynote addresses from James Thomas, Let’s Marry, called for greater connectedness in pursuing relationships between those working in DDL research and those working in pedagogy and language acquisition. At one point he asked the audience to make a show of hands for those who knew of big names in the ELT world, including Scrivener, Harmer and Thornbury. Only a few raised their hands. He also made the point that these same ELT names don’t make their way into citations for research on DDL. Interestingly, I was tweeting points made in the sessions I attended to relevant EAP and ELT / EFL / ESL communities online without a TaLC conference hashtag. It would’ve been great to have the other TaLCers tweeting along with me, raising questions and noting key take-away points from the conference to engage interested parties who could not make the conference in person and to catalogue a twitterfeed for TaLC that could be searched by anyone via the Internet at a later point in time. It would’ve also been great to record keynote and presentation speakers as webcasts for later viewing. When approached about these issues later, however, the conference organisers did express interest in ways of amplifying their events by building such mechanisms for openness into their next conference.
Prising open corpus linguistics research in Data Driven Learning (DDL)
Problems with accessing corpus-based resources and successfully implementing them in language teaching and learning scenarios have been numerous. As I discussed in section 2 of this blog, many of the concordancing tools referred to in the research have been subscription-based proprietary resources (for example, the WordSmith Tools), most of which have been designed with at least the intermediate-level concordance user in mind. These tools can easily overwhelm language teaching practitioners and their students with the complex processing of raw corpus data, presented via complex interfaces with too many options for refinement. Mike Scott, the main developer of the WordSmith Tools, has also released a free version of his concordancing suite with less functionality, and this would suffice for many language teaching and learning purposes. He attended my presentation on opening up research corpora with open-source text analysis tools and OER and was very open-minded, as were the other TaLCers whom I met at the conference, regarding new and open approaches for engaging teachers and learners with corpus-based resources.
There are many freely available annotated bibliographies compiled by corpus linguists which you can access on the web for guidance on published research into corpus linguistics. Many researchers working in this area are also putting pre-print versions of their research publications on the web for greater access and dissemination of their work; see Alex Boulton’s online presence for an example of this. Also hinted at earlier in part 2 of this blog, however, are the closed formats that much of this published research takes, in the form of articles, chapters and the few teaching resources available that are often restricted to and embedded within subscription-only journals or pricey academic monographs. For example, Berglund-Prytz’s ‘Text Analysis by Computer: Using Free Online Resources to Explore Academic Writing’ (2009) is a great written resource on where to get started with OER for EAP but, ironically, the journal it is published in, Writing and Pedagogy, is not free. Lancaster University is home to the openly available BNCweb concordancing software, which you only need to register for to be able to install a free standard copy on your personal computer. A valuable companion resource on BNCweb was published by Peter Lang in 2008 but once again this is not openly accessible to interested readers who cannot afford to buy the book. The great news is that the main TaLC10 organiser, Agnieszka Lenko, has spearheaded openness with this most recent event by trying to secure an Open Access publication for the TaLC10 proceedings papers with Versita publishers in London.
DIY corpora with AntConc in English for Specific Academic Purposes (ESAP)
At TaLC10 I discovered a lot of overlap with Maggie Charles’ work on building DIY corpora with EAP postgraduate students using the AntConc freeware by Laurence Anthony. We had also included workshops on AntConc for students in our OER for EAP cascade at Durham so it was great to see another EAP practitioner working in this way who had gathered data from her on-going work in this area for presentation and discussion at the conference. Many of her students at the University of Oxford Language Centre are working toward dissertation or thesis writing which raises interesting questions around enabling EAP students to become proficient in developing self-study resources for English for Specific Academic Purposes (ESAP). Her recent paper in the English for Specific Purposes Journal (2012) points to AntConc’s flexibility for student use due to it being freeware that can be installed on any personal computer or flash-drive key for portable use. Laurence Anthony’s website also offers a lot of great video training resources for how to use AntConc. The potential that AntConc offers for building select corpora to those students currently pursuing inter-disciplinary studies in higher education is also noted by Charles. Having said this, drawbacks with certain more obscure subject disciplines, for example Egyptology (Ibid.), that had not yet embraced digital research cultures and were still publishing research in predominantly print-based volumes or image-based .pdf files made the development of DIY corpora still beyond the reach of those few students.
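For readers who have never seen what a concordancer does with a DIY corpus, here is a minimal sketch in Python of a keyword-in-context (KWIC) search. This is not how AntConc itself is implemented; the function name `kwic`, the `width` parameter and the sample sentences are all made up for illustration.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of `keyword`."""
    lines = []
    # \b restricts matches to whole words; IGNORECASE also catches capitalised forms
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keyword lines up in a column
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines

# A tiny stand-in for a student's own collection of research articles
sample = (
    "The results suggest that the method is robust. "
    "These findings suggest a need for further research. "
    "We suggest that future studies use larger corpora."
)

for line in kwic(sample, "suggest"):
    print(line)
```

Reading the aligned lines side by side is what lets a learner notice patterns such as "suggest that" versus "suggest a need" in texts from their own discipline.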
Beyond books and podcasts through linking and crowd-sourcing
While presenting on the power of linked resources within the FLAX collections and pushing these outward to wider stakeholder communities through TOETOE, I came across another rapid innovation JISC-funded OER project at the Beyond Books conference at Oxford. The SPINDLE project, also based at the Learning Technologies Group Oxford, has been exploring linguistic uses for Oxford’s OpenSpires podcasts with work based on open-source automatic transcription tools. Automatic transcription is often accompanied by a high rate of inaccuracy. SPINDLE has been looking at ways of developing crowd-sourcing web interfaces that would enable English language learners to listen to the podcasts and correct the automatic transcription errors as part of a language learning crowd-sourcing task.
Automatic keyword generation was also carried out in the SPINDLE project on OpenSpires project podcasts, yielding far more accurate results. These keyword lists which can be assigned as metadata tags in digital repositories and channels like iTunesU offer further resource enhancement for making the podcasts more discoverable. Automatically generated keyword lists such as these can also be used for pedagogical purposes with the pre-teaching of vocabulary, for example. The TED500 corpus by Guy Aston which I also came across at TaLC10 is based on the TED talks (ideas worth spreading) which have also been released under creative commons licences and transcribed through crowd-sourcing.
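To give a feel for how keyword lists can be generated automatically, here is a minimal sketch in Python of one approach that is common in corpus linguistics: log-likelihood keyness, which flags words that are unusually frequent in a target text compared with a reference corpus. I am not claiming this is the method SPINDLE used; the function names and the two tiny sample texts are assumptions made for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def log_likelihood(a, b, c, d):
    """Keyness of a word seen a times in a target of c tokens
    and b times in a reference corpus of d tokens."""
    e1 = c * (a + b) / (c + d)  # expected count in the target
    e2 = d * (a + b) / (c + d)  # expected count in the reference
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

def keywords(target_text, reference_text, top_n=5):
    """Rank words that are relatively more frequent in the target."""
    target = Counter(tokenize(target_text))
    reference = Counter(tokenize(reference_text))
    c, d = sum(target.values()), sum(reference.values())
    scored = []
    for word, a in target.items():
        b = reference[word]
        if a / c > b / d:  # skip words that are not over-represented
            scored.append((word, log_likelihood(a, b, c, d)))
    # Highest score first; ties broken alphabetically for stable output
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]

# Hypothetical snippets standing in for a podcast transcript and a general corpus
transcript = (
    "The recession hit the banks and the credit markets. "
    "Credit dried up as the recession deepened and banks stopped lending."
)
reference = (
    "The weather was fine and the children played in the park. "
    "The dog ran to the river and the birds sang."
)

for word, score in keywords(transcript, reference, top_n=3):
    print(word, round(score, 2))
```

With real data the reference would be a large general corpus, and the top-ranked words would be candidates for metadata tags or for a pre-teaching vocabulary list.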
The potential for open linguistic content to be reused, re-purposed and redistributed by third parties globally, provided that they are used in non-commercial ways and are attributed to their creators, offers new and exciting opportunities for corpus developers as well as educational practitioners interested in OER for language learning and teaching.
Anthony, L. (n.d.). Laurence Anthony’s Website: AntConc.
Berglund-Prytz, Y. (2009). Text Analysis by Computer: Using Free Online Resources to Explore Academic Writing. Writing and Pedagogy, 1(2): 279–302.
British National Corpus, version 3 (BNC XML Edition). 2007. Distributed by Oxford University Computing Services on behalf of the BNC Consortium.
Charles, M. (2012). ‘Proper vocabulary and juicy collocations’: EAP students evaluate do-it-yourself corpus-building. English for Specific Purposes, 31: 93-102.
Lexical Analysis Software & Oxford University Press (1996-2012). Wordsmith Tools.
Hoffmann, S., Evert, S., Smith, N., Lee, D. & Berglund Prytz, Y. (2008). Corpus Linguistics with BNCweb – a Practical Guide. Frankfurt am Main: Peter Lang.
Original, in-house and live, this station brings us what’s new in the world of OER for corpus-based language resources.
Kicking things off in late March with Clare Carr from Durham, we co-presented an OER for EAP corpus-based teacher and learner training cascade project at the EuroCALL CMC & Teacher Education Annual Workshop in Bologna, Italy. This was very much a flipped conference, whereby draft presentation papers were sent to be read in advance by participants and the focus was on discussion rather than presentation at the physical event. Russell Stannard of Teacher Training Videos (TTV) was the keynote speaker at this conference and I have been developing some training resources for the FLAX open-source corpus collections which will be ready to go live on TTV soon. New collections in FLAX have opened up the BAWE corpus and have linked this to the BNC, a Google-derived n-gram corpus as well as Wikimedia resources, namely Wikipedia and Wiktionary. These collections in FLAX show what’s cutting edge in the developer world of open corpus-based resources for language learning and teaching.
Focusing on linked resources: which academic vocabulary list?
In a later post, I will be looking at Mark Davies’ new work with Academic Vocabulary Lists based on a 110 million-word academic sub-corpus in the Corpus of Contemporary American English (COCA) – moving away from the Academic Word List (AWL) by Coxhead (2000) based on a 3.5 million-word corpus – and his innovative web tools and collections based on the COCA. Once again, Davies’ Word and Phrase project website at Brigham Young University contains a bundle of powerfully linked resources, including a collocational thesaurus which links to other leading research resources such as WordNet, the on-going lexical database project at Princeton.
The open approach to developing non-commercial learning and teaching corpus-based resources in FLAX also shows the commitment to OER at OUCS (including the Oxford Text Archive), where the BAWE and the BNC research corpora are both managed. Click on the image below to visit the BAWE collections in FLAX.
Open eBooks for language learning and teaching
Learning Through Sharing: Open Resources, Open Practices, Open Communication, was the theme of the EuroCALL conference and to follow things up the organisers have released a call for OER in languages for the creation of an open eBook on the same theme. The book will be “a collection of case studies providing practical suggestions for the incorporation of Open Educational Resources (OER) and Practices (OEP), and Open Communication principles to the language classroom and to the initial and continuing development of language teachers.” This open-access e-Book, aimed at practitioners in secondary and tertiary education, will be freely available for download. If you’re interested in submitting a proposal to contribute to this electronic volume, please send in a case study proposal (maximum 500 words) by 15 October 2012 to the co-editors of the publication, Ana Beaven (University of Bologna, Italy), Anna Comas-Quinn (Open University, UK) and Barbara Sawhill (Oberlin College, USA).
MOOC on Open Translation tools and practices
Another learning event which I’ve just picked up from EuroCALL is a pilot Massive Open Online Course in open translation practices being run by the British Open University from 15 October to 7 December 2012 (8 weeks), with the accompanying course website opening on 10 October 2012. Visit the “Get involved” tab on the following site: http://www.ot12.org/. “Open translation practices rely on crowd sourcing, and are used for translating open resources such as TED talks and Wikipedia articles, and also in global blogging and citizen media projects such as Global Voices. There are many tools to support Open Translation practices, from Google translation tools to online dictionaries like Wordreference, or translation workflow tools like Transifex.” Some of these tools and practices will be explored in the OT12 MOOC.
Bringing open corpus-based projects to the Open Education community
On the back of the Cambridge 2012 conference, Innovation and Impact – Openly Collaborating to Enhance Education, held in April, I’ve been working on another eBook chapter on open corpus-based resources which will be launched very soon at the Open Education conference in Vancouver. The Cambridge 2012 event was jointly hosted in Cambridge, England by the OpenCourseWare Consortium (OCWC) and SCORE. Presenting with Terri Edwards from Durham, I covered EAP student and teacher perceptions of training with open corpus-based resources from three projects: FLAX, the Lextutor and AntConc. These three projects vary in terms of openness and the type of resources they are offering. In future posts I will be looking at their work and the communities that form around their resources in more depth. The following video from the conference has captured our presentation and the ensuing discussion at this event for a non-specialist audience who are curious to know how open corpus-based resources can help with the open education vision. Embedding these tools and resources into online and distance education to support the growing number of learners worldwide who wish to access higher education, where the OER and most published research are in English, opens a whole new world of possibilities for open corpus-based resources and EAP practitioners working in this area.
A further video from a panel discussion which I contributed to – an OER kaleidoscope for languages – looks at three further open language resources projects that are currently underway and building momentum here in the UK: OpenLives, LORO and the CommunityCafe. Reference to other established OER projects for languages and the humanities, including LanguageBox and the HumBox, is also made in this talk.
A world declaration for OER
The World OER congress in June at the UNESCO headquarters in Paris marked ten years since the coining of the term OER in 2002 along with the formal adoption of an OER declaration (click on the image to see the declaration). I’ve included the following quotation from the OER declaration to provide a backdrop to this growing open education movement as it applies to language teaching and learning, highlighting that attribution for original work is commonplace with creative commons licensing.
Emphasizing that the term Open Educational Resources (OER) was coined at UNESCO’s 2002 Forum on OpenCourseWare and designates “teaching, learning and research materials in any medium, digital or otherwise, that reside in the public domain or have been released under an open license that permits no-cost access, use, adaptation and redistribution by others with no or limited restrictions. Open licensing is built within the existing framework of intellectual property rights as defined by relevant international conventions and respects the authorship of the work”.
Wikimedia – why not?
Earlier in September, I volunteered to present at the EduWiki conference in Leicester which was hosted by the Wikimedia UK chapter. Most people are familiar with Wikipedia which is the sixth most visited website in the world. It is but one of many sister projects managed by the Wikimedia Foundation, however, along with others such as Wikiversity, Wiktionary etc.
I will also be blogging soon about widely held misconceptions about uses of Wikipedia in EAP and EFL / ESL while exploring its potential in writing instruction with reference to some very exciting education projects using Wikipedia around the world. The types of texts that make up Wikipedia, alongside many academics’ realisations that they need to be reaching wider audiences with their work through more accessible modes of writing, are all issues I will be commenting on in this blog in the very near future.
In presenting the work the FLAX team have done with text mining, incorporating David Milne’s Wikipedia mining tool, I showed how evident the potential of Wikipedia is as an open corpus resource in language learning and teaching. I demonstrated how this Wikipedia corpus has been linked to other research corpora in FLAX, namely the BNC and the BAWE, for the development of corpus-based OER for EFL / ESL and EAP. And, let’s not forget that it’s all for free!
The open approach to corpus resources development
There is no reason why the open approach taken by FLAX cannot be extended to build open corpus-based collections for learning and teaching other modern languages, linking different language versions of Wikipedia to relevant research corpora and resources in the target language. In particular, functionality in the FLAX collections that enables you to compare how language is used differently across a range of corpora, further supported by additional resources such as Wiktionary and Roget’s Thesaurus, makes for a very powerful language resource. Crowd-sourcing corpus resources through open research and education practices and through the development of open infrastructure for managing and making these resources available is not as far off in the future as we might think. The Common Language Resources and Technology Infrastructure (CLARIN) mission in Europe is a leading success story in the direction currently being taken with corpus-based resources (read more about the recent workshop for CLARIN-D held in Leipzig, Germany).
These past few months I’ve been tuning into a lot of different practitioner events and discussions across a range of educational communities which I feel are of relevance to English language education where uses for corpus-based resources are concerned. There’s something very distinct about the way these different communities are coming together and in the way they are sharing their ideas and outputs. In this post, I will liken their behaviour to different types of radio station broadcast, highlighting differences in communication style and the types of audience (and audience participation) they tend to attract.
I’ve also been re-setting my residential as well as my work stations. No longer at Durham University’s English Language Centre, I’m now London-based and have just set off on a whirlwind adventure for further open educational resources (OER) development and dissemination work with collaborators and stakeholders in a variety of locations around the world. TOETOE is going international and is now being hosted by Oxford University Computing Services (OUCS) in conjunction with the Higher Education Academy (HEA) and the Joint Information Systems Committee (JISC) as part of the UK government-funded OER International programme.
I will also be spreading the word about the newly formed Open Education Special Interest Group (OESIG), the Flexible Language Acquisition (FLAX) open corpus-based language resources project at the University of Waikato, and select research corpora, including the British National Corpus (BNC) and the British Academic Written English (BAWE) corpus, both managed by OUCS, which have been prised open by FLAX and TOETOE for uses in English as a Foreign Language (EFL) – also referred to as English as a Second Language (ESL) in North America – and English for Academic Purposes (EAP). Stay tuned to this blog in the coming months for more insights into open corpus-based English language resources and their uses in different teaching and learning contexts.
This post is what those in the blogging business refer to as a ‘cornerstone’ post as it includes many insights into the past few months of my teaching fellowship in OER with the Support Centre for Open Resources in Education (SCORE) at the Open University in the UK. Many posts within one, as it were. This post also provides a road map for taking my project work forward while identifying shorter blogging themes for posts that will follow this one. This particular post will also act as the mother-ship TOETOE post from which subsequent satellite posts will be linked. Please use the red menu hyperlinks in the section below to dip in and out of the four main sections of this blog post series. I have chosen this more reflective style of writing through blogging so that my growing understandings in this area are more accessible to unanticipated readers who may stumble upon this blog and hopefully make comments to help me refine my work. Two more formal case studies on my TOETOE project to date will be coming out soon via the HEA and the JISC.
I have also made this hyperlinked post (in five sections) available as a .pdf on Slideshare.
Which station(s) are you listening to?
BBC Radio has been going since 1927. With audiences in the UK, four stations in particular are firm favourites: youth-oriented BBC Radio 1, featuring new and contemporary music; BBC Radio 2, with middle-of-the-road music for the more mature audience; high culture and arts-oriented BBC Radio 3; and news and current affairs-oriented BBC Radio 4. Of course there are many more stations but these four are very typical of those found around the world. What is more, I’ve selected these four very distinct stations as the basis to build a metaphor around the way four very distinct educational practitioner communities are intersecting with corpus-based language teaching resources. This metaphor will draw on thought waves from the following:
I attended a meeting held at the Open University in the UK at the end of February to discuss the future of open education in the UK. I am a teaching fellow with the Support Centre for Open Resources in Education (SCORE), one of about 400 people working in UK higher education who have been involved in government-funded open educational resources (OER) projects over the last three years. When we all made our applications for funding to the Joint Information Systems Committee (JISC) and the Higher Education Academy (HEA) in the UK we also made the usual commitment in our proposals to sustaining our OER projects after their funded lifetimes. So, what better way to reinforce this commitment than by signing a renewed pledge to Open Education? While the Cape Town Open Education Declaration has been picked up by many organisations around the world we thought it would be a good idea to re-mix this declaration to make it more personalised for the educational practitioner.
What does this all mean for English language teaching practitioners?
Frontrunners for technology-enhanced ELT, Russell Stannard and David Deubelbeiss, have also been pushing for more open educational resources and practices within ELT.
One of the things that interests me most about this post and the comments related to it is the issue of attribution to the original work on automaticity by Gatbonton and Segalowitz. Attribution is essential whether you’re sharing resources in closed teaching and learning environments (e.g. classrooms, password-protected virtual learning environments, workshop and continuing professional development spaces) or through publishing channels using copyright or copyleft licences (e.g. books, research articles, blogs, online forum discussions). There is obviously a great amount of sharing and attribution going on in this discussion and the blogging platform is an enabler for this activity.
What also interests me is the behaviour around resource enhancement. As Scott outlines in the example here, an original resource from a research article by Gatbonton and Segalowitz was re-formatted into a workshop by Stephen Gaies (presumably with attribution to Gatbonton and Segalowitz). This in turn inspired Scott to engage in further resource gathering to inform his teaching practice while applying the five criteria for automaticity, and this further informed the section on fluency in his book, How to Teach Grammar (presumably with attribution to Gaies, but now he realises he should’ve included attribution to Gatbonton and Segalowitz). In its latest iteration we find the same criteria for automaticity here in his blog post, which contains more ideas on how to apply this approach in language learning and teaching from both Scott and his blogpost readers. This is a great example of resource enhancement via re-use and re-mixing, something which the creative commons suite of licences (http://creativecommons.org/) allows materials developers and users to do while maintaining full legal attribution rights for the original developer as well as extended rights to the re-mixer of that resource to create new derivative resources.
Legally enabling others to openly re-mix your resources and publish new ones based on them was not possible back in 1988. Arguably, Gatbonton and Segalowitz’s paper with the original criteria on automaticity has stood the test of time because of its enhancement through sharing by Gaies and by the same criteria having been embedded in a further published iteration by Scott in How to Teach Grammar. Times have changed and there is a lot we can now do with digital capabilities for best practice in the use and re-use of resources with attribution still being at the core of the exchange between resource creation and consumption. Except that now with self-publishing and resource sharing platforms, including blogs, it’s a lot easier for all of us to be involved in the resource creation process and to receive attribution for our work in sharing. This coming week, March 5-10, is Open Education Week http://www.openeducationweek.org/ with many great resources on how to openly share your teaching and learning resources along with how to locate, re-use, re-mix and re-distribute with attribution those open educational resources created by others. Why not check it out and see how this activity can apply to ELT?
If you’re new to all of this and have any pesky questions about the business models behind open education, please check out Paul Stacey’s blog, Musings on the Edtech Frontier, with his most recent post on the Economics of Open. Information on the different Creative Commons and Public Domain licences can be found at CreativeCommons.org.
So, why the interest in British resources for open English?
I’ve been coming in and out of the UK for the past 10 years with my work related to technology-enhanced ELT and EAP. Resources include not only those artifacts that we teach and learn with but also the vibrant communities that come together to share their understandings with peers through open channels of practice. BALEAP, formerly a British organisation (the British Association of Lecturers in English for Academic Purposes) but now with an outreach mandate to become the global forum for EAP practitioners, is one such informal community of practice. Members within BALEAP are actively making up for a deficit in formal EAP training by providing useful resources to both EAP teachers and learners via their website and through lively discussions relevant to current issues in EAP via their mailing list.
Because of my interest in corpus linguistics and data-driven language learning, I’ve also been working with exciting practitioners from the world of computer science, namely those working at the open source digital library software lab, Greenstone, at the University of Waikato in New Zealand, to help with the testing and promotion of their open English language project, FLAX (the Flexible Language Acquisition project). The FLAX team are building open corpora and open tools for text analysis using a combination of both open and proprietary content. A copyrighted reference corpus such as the British National Corpus (BNC) is enhanced within the FLAX project by being linked to different open reference corpora such as a Wikipedia corpus and a Web-derived corpus (released by Google) as well as specialist corpora, including the copyrighted British Academic Written English (BAWE) corpus, developed by Nesi, Gardner, Thompson and Wickens between 2004 and 2007 and housed within the Oxford Text Archive (OTA).
Oxford University Computing Services (OUCS) manage the OTA along with jointly managing the BNC which is physically housed at the British Library. The OpenSpires project is also based at the OUCS and this is where Oxford podcasts have been made openly available through creative commons licences for use and re-use in learning and teaching beyond the brick-n-mortar that is Oxford’s UK campus. Try out the Credit Crunch and Global Recession OER that are based on an Oxford seminar series and have been enhanced with corpus-based text analysis resources. Or, make your own resources based on these same seminars to share with your own learning and teaching communities. In addition to being housed on the OUCS website these resources, along with many other creative commons-licensed resources from educational institutions around the world, can also be found on the Apple channel, iTunesU.
So, it seems there’s quite a bit going on with open English in the UK that’s worth engaging with, and maybe even making a commitment to sharing with open educational resources and practices.
A final take-away
Check out FLAX’s new Learning Collocations collection where you can compare collocations for keyword searches and harvest useful phrases to embed into your writing, using the BAWE and the BNC along with corpora derived from Wikipedia and the Web. There are three training videos on how to use the Learning Collocations collection in FLAX available in the Training Videos section of this blog.
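If you are curious about what comparing collocations across corpora involves underneath, here is a minimal sketch in Python that counts the words appearing within a small window of a keyword in two different texts. This is not FLAX's implementation; the function name `collocates`, the `window` parameter and both sample texts are assumptions made for illustration.

```python
import re
from collections import Counter

def collocates(text, keyword, window=2):
    """Count words occurring within `window` tokens of each `keyword` hit."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == keyword:
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:  # do not count the keyword itself
                    counts[tokens[j]] += 1
    return counts

# Two tiny stand-ins for an academic corpus and a general corpus
academic = (
    "This study provides strong evidence for the claim. "
    "The evidence suggests a strong effect. "
    "Strong evidence supports the model."
)
general = (
    "The police found evidence at the scene. "
    "There was no evidence of a break-in. "
    "The police found no evidence."
)

for name, corpus in (("academic", academic), ("general", general)):
    print(name, collocates(corpus, "evidence").most_common(3))
```

Even on toy data the contrast shows up: "strong" keeps company with "evidence" in the academic sample but not in the general one. Real collocation tools also filter out function words like "the" and use association measures over much larger corpora.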