Day 4 with GPT-2

Throughout the last few days I have been devoting all my spare time to learning about and working with the GPT-2 model from OpenAI. They published a paper about the model and it makes for an interesting read. The more interesting part of the equation is actually working with the model and trying to understand how it was constructed and working with all the moving parts. My first efforts were to install it locally on my Windows 10 box. Every time I do that I always think it would have been easier to manage in Ubuntu, but that would be less of a challenge. I figured giving Windows 10 a chance would be a fun part of the adventure. Giving up on Windows has been getting easier and easier. I actually ran Ubuntu Studio as my main operating system for a while with no real problems. 

https://openai.com/blog/better-language-models/

Day 3 with GPT-2

My training data set for my big GPT-2 adventure is everything published on my weblog. That includes about 20 years of content. The local copy of the original Microsoft Word document with all the formatting was 217,918 kilobytes, whereas the text document version dropped all the way down to 3,958 kilobytes. I did go and manually open the text document version to make sure it was still readable and structured content.

The first problem is probably easily solved and it relates to a missing module named “numpy”:

PS F:\GPT-2\gpt-2-finetuning> python encode.py nlindahl.txt nlindahl.npz
Traceback (most recent call last):
  File "encode.py", line 7, in <module>
    import numpy as np
ModuleNotFoundError: No module named 'numpy'
PS F:\GPT-2\gpt-2-finetuning>

Resolving that required a simple “pip install numpy” in PowerShell. That got me all the way to line 10 in the encode.py file, where this new error occurred:

PS F:\GPT-2\gpt-2-finetuning> python encode.py nlindahl.txt nlindahl.npz
Traceback (most recent call last):
  File "encode.py", line 10, in <module>
    from load_dataset import load_dataset
  File "F:\GPT-2\gpt-2-finetuning\load_dataset.py", line 4, in <module>
    import tensorflow as tf
ModuleNotFoundError: No module named 'tensorflow'

Solving this one required a similar method in PowerShell, “pip install --upgrade https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.8.0-py3-none-any.whl”, which included a specific path to tell pip where to get TensorFlow. In hindsight, that wheel was built for macOS, which probably explains why it did not cooperate on Windows.
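For context, all encode.py is doing in that repo is pre-tokenizing the training text and saving the token ids so the finetuning run does not have to redo that work. Here is a rough stdlib-only sketch of the idea, with a toy whitespace vocabulary standing in for the real GPT-2 byte-pair encoder (the actual script uses numpy to write a compressed .npz file, which json stands in for here):

```python
import json

def build_vocab(text):
    # Toy stand-in for the GPT-2 BPE vocabulary: one id per unique word.
    words = sorted(set(text.split()))
    return {w: i for i, w in enumerate(words)}

def encode(text, vocab):
    # Map the training text to a flat list of integer token ids.
    return [vocab[w] for w in text.split()]

def decode(ids, vocab):
    # Invert the mapping to confirm nothing was lost in the round trip.
    inv = {i: w for w, i in vocab.items()}
    return " ".join(inv[i] for i in ids)

if __name__ == "__main__":
    corpus = "the quick brown fox jumps over the lazy dog"
    vocab = build_vocab(corpus)
    ids = encode(corpus, vocab)
    # encode.py would np.savez_compressed these ids; json works for a sketch.
    payload = json.dumps(ids)
    assert decode(json.loads(payload), vocab) == corpus
```

The real encoder works on byte pairs rather than whole words, but the shape of the pipeline is the same: text in, integer ids out, ids saved to disk once.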

I gave up on that path and went a different route…

https://github.com/openai/gpt-2/blob/master/DEVELOPERS.md

and

https://colab.research.google.com/github/ilopezfr/gpt-2/blob/master/gpt-2-playground_.ipynb

A second day of working with GPT-2

Getting the GPT-2 model setup on this Windows 10 machine was not as straightforward as I had hoped it would be yesterday. Python got upgraded, Cuda got upgraded, cuDNN got installed, and some flavor of the C++ build tools got installed on this machine. Normally when I elect to work with TensorFlow I boot into an Ubuntu instance instead of trying to work with Windows. That is where I am more proficient at managing and working with installations and things. I’m also a lot more willing to destroy my Ubuntu installation and spin up another one to start whatever installation steps I was working on again from the start in a clean environment. My Windows installation here has all sorts of things installed on it and some of them were in conflict or something with my efforts to get GPT-2 running. In fairness to my efforts yesterday, I only had a very limited amount of time after work to figure it all out. Time ran out after the installation had occurred via the steps on GitHub, but no magic was happening, and that was a truly disappointing scenario.

Interrupted. School.

Predicting my next post

Yesterday, I started looking around at all the content I have online. The only base I do not have covered is probably needing to share a new speaking engagement photo online. I need to set up a page for speaking engagements at some point with that photo and a few instructions on how best to request my engagement.  Every time I have done a speaking engagement my weblog and Twitter traffic picked up for a little bit. Using the “Print My Blog” plugin I was able to export 1,328 pages of content for a backup yesterday. My initial reaction to that was wondering how many pages of that were useful content and how much of it was muddled prose. Not only did that question of usefulness make me wonder, but also I wondered if I loaded that file into the OpenAI GPT-2 what would come out as the predicted next batch of writing. That is probably enough content to spit out something that reasonably resembles my writing. I started to wonder if the output would be more akin to my better work or my lesser work. Given that most of my writing is somewhat iterative and I build on topics and themes the GPT-2 model might very well be able to generate a weblog in my style of writing. 

Just for fun I’m going to try to install and run that little project. When that model got released I spent a lot of time thinking about it, but did not put it to practice. Nothing would be more personal than having it generally create the same thing that I tend to generate on a daily basis. A controlled experiment would be to set it up and let it produce content each day and compare what I produce during my morning writing session to what it spits out as the predicted next batch of prose. It would have the advantage or disadvantage of being able to review 1,328 pages and predict what is coming next. My honest guess on that one is that the last 90 days are probably more informative for prediction than the last 10 years of content. However, that might not be accurate based on how the generative model works. All that content might very well help fuel the right parameters to generate that next best word selection. I had written “choice” to end that last sentence, but that felt weird to write that the GPT-2 model was making a choice so I modified the sentence to end with selection.
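The controlled experiment described above really only needs a scoring function to compare the two posts each day. A minimal sketch using the standard library's sequence matcher (this similarity function is my own stand-in for the comparison step, not anything from the GPT-2 code):

```python
from difflib import SequenceMatcher

def similarity(my_post: str, generated_post: str) -> float:
    # Ratio in [0, 1]: 1.0 means the two texts match exactly.
    return SequenceMatcher(None, my_post, generated_post).ratio()

# Compare a morning writing session against the model's predicted post.
mine = "These are strange and different times."
model = "These are strange and difficult times."
score = similarity(mine, model)
assert 0.0 <= score <= 1.0
```

Logging that score every day would show whether the model tracks the last 90 days of writing more closely than the older material.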

Interrupted. School.

These are strange and different times

Returning to form or so it goes takes a bit of effort. Any return to form without effort would inherently discount the journey. Shifting back without a bit of effort might just be acceptable right now. These are strange and different times. This will be the 9th day in a row of posting something to the weblog. That streak is starting to feel a little bit more normal. Every day my thoughts have started to get back into a more orderly form that can be turned quickly into prose. That is the key element in turning the corner and engaging in a bit of writing. Not only is clearing your mind enough to not do anything a skill, but allowing your stream of consciousness to spill out onto the screen as prose is one as well. Transforming thoughts almost directly into keystrokes in an effortless way is the hallmark of being in the writing pocket and that feels like something that happens from practice.

This weekend I have spent a lot of time thinking about what exactly election data can tell us about the state of civil society and the general degree of civility at large. Within the world of an election the universe being examined could be all voters or it could be all people that could vote. Some of the best insights available could be about the people who take no action and choose to sit out of the election process. My first response to investigating that phenomenon was a simple series of thoughts about how maybe they did not know it was election day. It is entirely possible for a lot of people that government is a thing that stands separate from the routines of daily life and voting by proxy stands separately from everyday life. Certainly some places have moved to mail in ballots and have made it much easier to vote. Other places have gone the other way and made it much harder to participate in the voting process. Now we are starting to get somewhere in the analysis. Three potential reasons have jumped out: 1) they were unaware, 2) it was easy to ignore, or 3) it was very hard. That set of thoughts certainly expresses a continuum of sorts that could be expressed as some kind of Likert scale.

My initial analysis has started at the congressional district level. My assumption is that I can reasonably roll up my congressional based model to the state level and use a bit of a convoluted transform to get to a national outcome. Within a national election model just using the general sentiment would express a popular vote based outcome and that would not work all the time. Sometimes it would yield the correct result, but other times it might yield a false positive within a condition where having the most votes at a national level is not aligned to the outcome. That is a scenario that political scientists will be writing about for years to come. Social scientists in general will be studying that and how it influences both civility and civil society for decades. Seriously, that is not an overstatement. Our beliefs in how democracy functions are a very important part of how we engage in a social contract to participate in the normative routines that allow daily life to function as well as it does. Maybe this watershed event that is occurring now will create some type of shared experience that will help people better relate to each other, strengthening the very social fabric that protects democracy.
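The roll-up itself is mechanically simple; a hedged sketch with invented district names and vote totals, assuming each district reports a two-party vote count:

```python
from collections import defaultdict

# Invented district-level two-party vote counts: (state, district) -> (a, b).
districts = {
    ("KS", 1): (60_000, 40_000),
    ("KS", 2): (55_000, 45_000),
    ("MO", 1): (30_000, 90_000),
}

def roll_up(districts):
    # Sum district counts up to the state level, then to a national total.
    states = defaultdict(lambda: [0, 0])
    for (state, _), (a, b) in districts.items():
        states[state][0] += a
        states[state][1] += b
    national = [sum(s[0] for s in states.values()),
                sum(s[1] for s in states.values())]
    return dict(states), national

states, national = roll_up(districts)
# Candidate a carries two of three districts while candidate b wins the
# national popular vote, which is exactly the false-positive condition
# where the most votes nationally does not align with the outcome.
```

With these made-up numbers the national total is 145,000 to 175,000, so the popular-vote signal and the district-level outcome point in opposite directions.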

I’m really starting to think that these are truly strange and different times. The lens through which we see the world and how we interact with things is changing every day as we experience a new normal way to interact with people and a new normal way to visit stores to go about the routines that allow daily life to occur. We all have to figure out how to make meals on a daily basis. Eating is a shared and common experience across all of humanity. It is one of those things that should be a commonly shared experience like voting for those that have reached a certain age. Outside of politics people generally do a lot of similar things every day. All those things could be modeled and sentiment analysis could be produced to figure out preferences based on those things. Somewhere inside of that universe of possible analysis a small slice of things exist that my research is focusing on right now. That is where my research is dialing into understanding voter sentiment and preference within elections. At this point, I’m very focused on the key factor of participation and the sentiment around why a large portion of voters are opting out of the process that literally guarantees the stability of our daily routines.

Given that I’m on my 9th day in a row of posting it might be a good time to mention that most of my writing is created and posted without any real editing or revision. My routine is generally to sit down and write until the writing is done and then post it online before starting a new session of writing. My time is not normally spent on the same passage of prose engaging in rework and editing to produce a perfect product. What you are reading right now is really just how the thoughts translated from my head to the keyboard. For better or worse that is generally how this weblog works and how prose is created to be published here. Some of it is grammatically correct and free of atrocious typos and some of it is very clearly not. One of the things that I do a lot is leave out a word that otherwise brings the flow of a sentence together. Some of that is just a weird thinking and typing problem where a word gets left out of a sentence. If you went back and read it, then you would immediately notice it and fill in the missing word. Most of the time that does not impact the meaning of what is being presented; it just creates a less than ideal situation for the reader who is wondering why proofreading was set aside or ignored. Please don’t wonder about it. I just elected not to spend my time editing the prose being created. Yeah, that is questionable.

Sometimes I wonder if maybe every Sunday I should swing back and edit the last 7 days of work and just leave a note at the end of the post that it was edited. Most of the time that thought occurs and is discarded. You can tell that analysis about discarding editing is accurate based on scrolling back a day, a week, or even a month to see it was not implemented. For the most part that type of effort is probably not going to be an active part of my routines. If it has not taken root in the last 20 years, then it is unlikely to start happening without some real effort to change my routine. As an analog to that lack of action on the editing front, trying to figure out why citizen participation in elections has been gradually declining is probably similar. It is something I could do with a little time and effort, but I just elect not to do it over and over again. You can kind of get a feel for where my head is at the moment and what is at the forefront of my considerations as I dive into this area of analysis.