Greetings! I have an honors student this year, ------ ------, who is interested in scraping Twitter data for her honors thesis project. The best way to extract these data from Twitter is to use a free tool called T-CAT. The only hitch is that T-CAT needs to be installed on a server in order to run. Would it be possible to run T-CAT on a server here at Muhlenberg?
I love this sort of email. Our small and scrappy college is very different from the R1 Information Science-heavy institutions where I cut my teeth and where requests like this were commonplace. But we at Muhlenberg do a lot with what we have, in large part because of our seven-year relationship with Reclaim Hosting. And we’re poised to do a lot more in the future with the super flexible, very powerful Reclaim Cloud. You can skip to the tl;dr below, or read on if you like.
DMI TCAT does not run on a cPanel site. The sysadmin commands that need to run at the command prompt are out of bounds on a Domain of One’s Own (DoOO) site, and there are dependencies that can’t be supported. Moreover, TCAT projects may pull a little of the Twitter stream, or they may pull A LOT! It just depends. So even if we could get it running on a Berg Builds domain, it’s very likely that the bandwidth demands would run afoul of usage limits and land an account in the penalty box.
Our Office of Information Technology is understandably cautious when responding to requests for support like this. I like to say that Higher Ed OIT/administrative IT, however it’s organized, is like that tag on a newly purchased pillow. OIT is there to ensure that our pillows aren’t filled with tiny shards of glass, or sawdust, or bedbugs. OIT says, “don’t sleep by an open flame!” It’s their job.
But if OIT is that tag on a pillow, DoOO and DoOO-adjacent technology is the pillow fort. What infuses my work and keeps me going is my eagerness to explore, to try and often flame out, and sometimes to even break things just to see how they work.
We didn’t have Reclaim Cloud when I received that email back in 2016. But that honors student DID ingest hundreds of thousands of tweets about the 2016 election and she developed an outstanding analysis based on a sophisticated use of open source software and APIs. We turned to Digital Ocean, we found some departmental funds to pay for droplets for a couple of months, and in collaboration with a supremely knowledgeable professor we got the systems administration work done to get DMI TCAT running. The student jumped in, learned the interface, registered the Twitter APIs, and got to work.
Today, DMI TCAT is dockerized, and much of this 2016-era work would be offloaded and accomplished much more quickly. And the hassle of scraping together departmental money (or even passing the tab to a student) can also be a thing of the past. How? Reclaim Cloud is how.
Reclaim Cloud is a service provided by Reclaim Hosting that takes the idea of a Domain of One’s Own to another level. A cPanel-based DoOO account presumes a specific stack (by that, I mean a set combination of server OS, web server, database, and middleware programming language), and in exchange for that trade-off we receive a hardened, easily transportable, low-barrier environment. But when projects arise that require Node.js and NGINX, for instance, or Python, or Ruby on Rails, or any of a growing number of web frameworks, then a conventional DoOO doesn’t meet the occasion, unfortunately.
Where bandwidth demands (the amount of data flowing through a site) might be large, a DoOO account is designed (on purpose) to fail in a way that preserves many websites at an institution at the expense of one getting slammed. Put another way, not all web hosting is equal. Most sites have modest numbers of visitors. Most sites don’t livestream media. And most don’t need redundancy — if it goes down for an hour or two it’s likely not a huge deal. But if a project does require these more intensive qualities, Reclaim Cloud can be the answer.
In the case of the student project I mention, we really didn’t know what to expect from the Twitter stream, and we wanted to put some parameters on the project. Specifically, we wanted to turn off the money tap if the amount of Twitter data became so large as to run up a huge Digital Ocean bill. Reclaim Cloud makes this sort of configuration a very easy thing to do. And it also means that if a project launches that is really, really successful and has the finances in place, it shouldn’t go down under a heavy load. This idea of scalability is an important one, and something a conventional OIT department at a higher ed institution will likely drop like a hot potato.
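To make the "money tap" idea concrete, here is a minimal sketch of that kind of rule in Python. It is not Reclaim Cloud's actual configuration or API (that gets set up in the dashboard). The function name, the budget, and the daily cost figures are all hypothetical and exist only to show the logic: add up what has been spent so far and flag the day the running total crosses the limit.

```python
from typing import Optional, Sequence


def first_day_over_budget(daily_costs: Sequence[float], budget: float) -> Optional[int]:
    """Return the 0-based index of the day on which cumulative spending
    first exceeds the budget, or None if it never does.

    Hypothetical helper for illustration; not part of any hosting API.
    """
    total = 0.0
    for day, cost in enumerate(daily_costs):
        total += cost
        if total > budget:
            return day
    return None


# Hypothetical numbers: a few quiet days, then the Twitter stream spikes.
costs = [1.0, 1.0, 1.5, 6.0, 9.0, 12.0]
stop_day = first_day_over_budget(costs, budget=10.0)
print(stop_day)  # the day we would want the project paused
```

A real setup would act on that signal (pause the environment, send an email) rather than just print it, but the decision itself is this simple.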
Similarly, suppose this Twitter-based election project was designed to be repeated every two years. Reclaim Cloud makes it really easy to mothball web-based stuff by turning it on and off. You pay only while a project is running (like during a major election year) and not when it’s idle. At a small college like mine, where we very much conform to the academic calendar, this means we could spin down various digital tools when they aren’t needed, and potentially save a lot of money.
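The arithmetic behind mothballing is easy to sketch. The snippet below assumes a flat, made-up monthly rate and assumes a paused project costs nothing at all; actual pricing depends on usage, and a stopped environment may still carry storage or other charges. The function name and every number here are hypothetical.

```python
def yearly_cost(monthly_rate: float, active_months: int) -> float:
    """Cost for one year if the project is billed only for its active months."""
    if not 0 <= active_months <= 12:
        raise ValueError("active_months must be between 0 and 12")
    return monthly_rate * active_months


MONTHLY_RATE = 20.0  # hypothetical dollars per month, not a real quote

always_on = yearly_cost(MONTHLY_RATE, 12)
election_season = yearly_cost(MONTHLY_RATE, 3)  # e.g. September through November

print(f"always on: ${always_on:.2f}")
print(f"election season only: ${election_season:.2f}")
print(f"saved: ${always_on - election_season:.2f}")
```

Multiply that difference across a handful of tools that only matter during the semester and the savings start to look real.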
Let’s suppose for a second that this honors student wants to continue her research at her new graduate school. Reclaim Cloud makes that super easy, too. These projects are easy to back up, clone, and transfer from place to place or account to account. They permit collaboration at the systems administration level AND at the application level. This may be a tough thing to visualize, but it is a HUGE benefit for a person like myself who often can get a complex bit of software very close, but may still need some help with that last bit of configuration or modification. Whether it’s the helpful folks at Reclaim or a colleague at my institution or elsewhere, it’s possible for two or more people to have administrative-level access to virtual environments or application servers. For student projects, or even for faculty projects, this would likely be a total impossibility for our OIT, and I totally understand why.
Now let’s suppose two years pass and my professor colleague wants to dust off DMI TCAT for the next iteration of a course he teaches every two years. But the open source DMI TCAT project has gone through a lot of upgrades and changes. Typically, this is a big headache, but a containerized project means that with a few commands (typically documented on a site like GitHub) that professor can get things updated. This means all the dependencies and special libraries will very likely update, as well. If there are hiccups, detailed logs are available without emailing anyone or asking for new sets of permissions to new directories.
I absolutely hate to say no to requests for assistance, and this goes double when there are students involved. But I also know my limits, so anything I agree to work on has to be fairly sustainable and can’t end up a full-time commitment. A utility (or cloud) platform, from where I sit, allows me to do some cool and important things in ways that don’t over-commit me or my colleagues.
So, in summary, here are just a few of my responses to the question, “Why Reclaim Cloud?”
- experiment and check the feasibility of stuff. It’s a lot easier to respond to a working thing than the idea of a thing. Spinning it up myself means I can have something real for a colleague or a student to evaluate. Reclaim Cloud has a dead simple installer for anything in their marketplace, essentially like how Installatron works within cPanel. This goes for dozens of applications and platforms whose number is growing all the time. And the community can help get things into this marketplace, as well.
- run it while needed then pause it. Or mothball it. The potential here is significant. With some careful analysis, a lot of ed-tech and administrative systems could be, if not entirely paused, at least tuned to use fewer utility processes (called “cloudlets”) during summers and other lag times. Or simply turning things off without killing them entirely could have a lot of potential for cost saving, security, etc.
- esoteric = no problem. Ever had a turf war over Linux environments? Like, you need Ubuntu but they only run CentOS? You never need to have those meetings ever again. Use Docker or build it from scratch. It will run on Reclaim Cloud or I’ll eat a beet (and I HATE beets).
- encourage more F/LOSS; encourage more self-hosting. This is an easy leap from DoOO. Students can develop their own digital literacy and competency while relying less on extractive hosted platforms of all kinds. Ghost, PeerTube, Mattermost, Jitsi, and LimeSurvey are a few already in the marketplace, but there are dozens and dozens of mature open source applications, containerized or not, that are possibilities.
- portability. Whether it’s cloning from project to project, or handing it off year to year and colleague to colleague, or — my favorite — students taking their work with them after graduation, the Reclaim Cloud infrastructure is slick and simple.
- collaboration. I mentioned this already, but the ability to collaborate at various levels of administrative access within the Reclaim Cloud dashboard has seriously cool possibilities. Just a few to think about: cross-institutional research; faculty-student co-creation; working first-hand with the developers and maintainers of open source projects; school-community collaboration (e.g., citizen science, public health, open data).
- simplified financials. Grant-funded projects and institutional infrastructures and budgets often clash. Paying on an annual cycle can really complicate small time-bound projects. The pricing and the pay-as-you-go flexibility of Reclaim Cloud cuts through a lot of institutional red tape. But invoicing and auditing capabilities are there for tracking even the most onerous accounting requirements. Whether it’s big grants or small student budgets, no problem. Pay-as-you-go. Auto-renewal. Built-out business rules (for example: spin this down if it costs too much per day). Reclaim Cloud can turn a lot of “no” answers into “oh yeah, we can do that” ones.
- An active community with porous boundaries between the various development communities and the Reclaim community, itself. This may seem like a simple thing, but in my experience it’s rare and pretty special. Centering a notion of self-hosting and DIY means vendor lock-in just isn’t how it’s done. So open source project managers, Reclaim Cloud customers, and Reclaim Hosting folks work as collaborators to get things done and figure things out.
I wish I had more time to dedicate to building and exploring on Reclaim Cloud. To date, I’ve done a lot of taking, of time and patience, and I’ve not had many chances to give back by way of documentation, building installers for the marketplace, or sharing with others. But I think about the possibilities a lot, and the hours I have logged have confirmed one thing: there is a lot of potential. More than anything, I want the students and educators at my college to have access to online spaces of exploration. And I insist that these spaces won’t confound them or exploit them. So I’m hopeful and excited about Reclaim Cloud and I’m determined to learn more so that when I get those awesome emails ….