Monday, August 14, 2017

Migrating multi-project Subversion repo to Git

I've been recording my Subversion to Git journeys to remind myself for the next Subversion conversion. Here are my prior entries in case they're helpful:
Simple SourceForge repo from Subversion to Git
Self-hosted Git server

I had saved one remaining Subversion project for last to migrate to Git because of its relatively complex structure. It's actually one of my first projects, the Text Trix text editor, which I had migrated from CVS to Subversion way back in the day using cvs2svn.

Here is the basic structure:
root
|--texttrix
   |--trunk/branches/tags
|--jsyntaxpanettx
   |--trunk/branches/tags
|--osterttx
   |--trunk/branches/tags
|--plugins
   |--plugin0
      |--trunk/branches/tags
   |--plugin1
   |--...
|--txtrx
   |--trunk/branches/tags
When I gave svn2git the root-level URL, it only tracked changes from the period when the repo had a single project in the trunk/branches/tags format. After I had expanded the repo to include multiple projects, each with its own trunk/branches/tags organization, svn2git stopped tracking the folders any further. To capture the full history, I pointed svn2git at the main project only, and it was able to track the project back to when it had moved there. I also excluded any .class, .jar, and .lex files, since these can be regenerated.

svn2git https://svn.code.sf.net/p/texttrix/svn/texttrix --authors ~/authors-ttx.txt --verbose --exclude ".*.class$" --exclude ".*.jar$" --exclude ".*.lex$"

For some reason, many of the files I had meant to exclude leaked through, so I eventually turned to the BFG Repo-Cleaner to remove them, with some helpful hints from this guide. I first needed to initialize a bare Git repo to "remotely" host my Subversion-imported repo:

mkdir ttxhost
cd ttxhost
git init --bare
cd ../texttrix
git remote add origin ../ttxhost

This "remote" host is actually local but serves as a clean repository from which BFG can pull and push. But before I actually ran BFG, I needed to make sure that my commit at HEAD did not have any files that I intended to remove since BFG leaves the HEAD commit alone.

git rm lib/*.jar
git commit -m "removing remaining jars"
git push origin master

Next I cloned this host repo and let BFG work its magic. I had some difficulty finding the syntax for removing multiple file types in the main docs but eventually found this Stack Overflow comment.

cd ..
mkdir ttxclean
cd ttxclean
git clone --mirror ../ttxhost
java -jar ../bfg-1.12.15.jar --delete-files "{*.class,*.jar,*.lex}" ttxhost.git

BFG reported that it had successfully cleaned the files, and after summing their sizes, I realized that BFG had removed files amounting to about 80% of my repo's original size. Inspecting the output carefully turned out to be important, as that was how I had learned here about the need to remove files from the HEAD commit.
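As an extra sanity check (my own addition here, not part of BFG's workflow), a pipeline like this lists the largest blobs still reachable anywhere in history, so the cleaned .jar and .class files should no longer show up:

# List the ten largest blobs remaining in history (largest last)
git rev-list --objects --all \
  | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
  | grep '^blob' \
  | sort -k3 -n \
  | tail -n 10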

To get all these changes onto the host repo, I needed to clean the repo and push its changes:

cd ttxhost.git
git reflog expire --expire=now --all && git gc --prune=now --aggressive
git push

When I went back to the host repo, however, it remained the same size. And running "git gc" made it even bigger! According to this post, this is known behavior for git gc, reflecting a safety mechanism that keeps unreferenced objects around for two weeks in unpacked form. Using the "prune" flag greatly reduced the size:

cd ../../ttxhost
git gc --prune=now

And voila, the host repo went from 9.1 MB to 3 MB! I next checked out a fresh copy of this repo to verify it:

cd ..
mv texttrix texttrix.old
git clone ttxhost texttrix
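
Before moving on, I could double-check that the purged files were really gone; these quick queries are my own sketch of such a check:

cd texttrix
# Expect no output: no commit in the rewritten history should touch a jar
git log --all --oneline -- "*.jar"
# Report the repo's packed size for comparison with the old copy
git count-objects -vH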

The new repo was clean and ready for upload to GitHub! Following this guide, I cloned a bare copy of the host repo and mirrored it to GitHub to upload all references.

mkdir ttxforgithub
cd ttxforgithub
git clone --bare ../ttxhost
cd ttxhost.git
git push --mirror https://github.com/the4thchild/texttrix.git

I removed my prior local repo and cloned in the one from GitHub:

cd ../..
rm -rf texttrix
git clone https://github.com/the4thchild/texttrix.git

And here it is on GitHub. Now I just have to repeat the process for each of the remaining projects and plugins.
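
Since the remaining projects share the same trunk/branches/tags layout, the conversion step could presumably be scripted; here's a rough sketch, where the per-project URLs and project names are my guesses to adapt:

# Hypothetical loop over the remaining projects; svn2git imports into
# the current directory, so give each project its own folder
for proj in jsyntaxpanettx osterttx txtrx; do
  mkdir "$proj" && cd "$proj"
  svn2git "https://svn.code.sf.net/p/texttrix/svn/$proj" \
    --authors ~/authors-ttx.txt --verbose \
    --exclude ".*.class$" --exclude ".*.jar$" --exclude ".*.lex$"
  cd ..
done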

Sunday, December 18, 2016

Trying to make sense of Starkiller base's demise

Rogue One felt like an old friend I never knew I missed. As several reviews have mentioned, many consider it the best Star Wars film since the original trilogy, a true hearkening back to the films we grew to know so well and love. But Force Awakens came out last year, and wasn't it already quite a resounding success, a return to original trilogy form?

I certainly thought so, and that was why I bought the digital film as soon as it became available. I had little difficulty explaining away most of the plot holes, at least to myself, but one hole kept nagging at me. How did the Resistance take down the Starkiller base with such apparent ease? Or conversely, why did the First Order make their prized weapon so vulnerable, and especially in such a way as to replicate the vulnerability of the two previous Imperial death planets?

Perhaps one key to reconciling this apparently fatal flaw lies in another potential plot hole. Another criticism raised against the Starkiller design is that its attachment to a planet greatly limits its mobility. Unless it could somehow up and move its entire planet, the weapon could really only target nearby planets within its fiery cannon's range.

Either this design was an incredible oversight, or it points to the fact that Starkiller was not an end unto itself. If it could only target planets within a local radius or along its orbital trajectory, perhaps the weapon was but a prototype, or at least the first of its class, one of many such weapons to come. We might conjecture that Starkiller was merely a new type of weapon that would be installed on many additional planets, each targeting New Republic planets within local reach.

As one of many, Starkiller would not be the be-all and end-all, the prized possession that the First Order would defend at any cost. The base would be important, no doubt, but equally important would be the ability to construct multiple such bases efficiently and within budget. As formidable as a Star Destroyer might be, for example, each ship would have to "cut corners" in the name of reasonable construction costs and time to make building similar ships feasible.

Similarly, Starkiller would have to work within its design constraints, which meant relying on less-than-impregnable defenses. And as the first of many similar designs, a 1.0 effort shall we say, of a weapon harnessing an inherently unstable energy source, the system remained vulnerable to destruction by a critical perturbation of its inner workings.

The Resistance took advantage of this vulnerability and killed the Starkiller base, but their celebration perhaps belies their own naive vulnerability. The base had already destroyed its intended targets -- multiple surrounding New Republic planets -- and many more Starkiller bases are perhaps to come, each situated near additional ripe Republic targets. And if the First Order can in fact learn from its mistakes, then these new Starkiller 2.0 bases will do away with the trench-run critical vulnerability once and for all. Or so we hope, at least for the sake of believability.

What made Rogue One feel so fresh was not having to go through such mental exercises to justify the plotline. But hey, there are many far more skilled people doing the creative heavy lifting to put these movies together in the first place, so props to them for painting the universe we so thoroughly debate and enjoy.

Monday, September 05, 2016

Another try at Matplotlib

A while back, I finally managed to install Matplotlib (and Mayavi) through general package management tools on the Mac. Through a combination of Homebrew and Pip and a lot of finagling, my take-3-or-so try at getting the Python graphing libraries to work actually came to fruition.

Some recent feedback on the tXtFL AI prompted me to revisit the modeling software, but alas, Matplotlib was no longer working on my system. When I tried to run my scripts, I got a "Fatal Python error: PyThreadState_Get: no current thread. Abort trap: 6" error. As far as I can tell, the error tracks back to a bug in the latest version of vtk included with Homebrew.

What was odd to me was that the Virtualenv technique I had used was supposed to isolate the dependency chain so that the packages would continue to work even if some of them got updated and broke compatibility with other packages. I realized, however, that the Homebrew package installations take place outside of the virtual environment, so at least the way that I've been using Virtualenv probably does not fully isolate the environment.

Not being able to run a modeler just because of a broken dependency is of course a real bummer, so I set out to find alternative solutions. I've been reluctant to try commercial offerings in hopes of sticking with generalized package management solutions such as Homebrew, but I came across an at least partially open source solution called Anaconda. The trouble with installing Matplotlib from scratch is the sheer number of dependencies (no less than "pycairo, PyQt4, PyQt5, PySide, wxPython, PyGTK, Tornado, or GhostScript" listed on the official installation page), which makes having a dedicated package manager like Anaconda for these and other Python scientific computing dependencies perfectly sensible.

Anaconda offers the option of a complete install ("Anaconda") vs a minimalist install ("Miniconda"), and I went for the Miniconda install to see if I could get by with it. After installing the 64-bit Windows version, I created a separate virtual environment inside it with the Matplotlib and Mayavi libraries as follows:

conda create --name pystat matplotlib mayavi

Apparently it is preferable to include all required packages at environment creation time. Anaconda pulled in all the other required dependencies and set up the virtual environment. All of these packages apparently remain available for new virtual environments as well, which streamlines future downloads.
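
That said, if a package is needed after the fact, conda can presumably still install it into the existing environment; for example (the package name here is just an illustration):

# Install an additional package into the existing pystat environment
conda install --name pystat numpy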

I'm used to using Cygwin for all my Bash-within-Windows computing, but Anaconda and Cygwin do not appear to always play nicely with each other. I tried to activate the virtual environment within Cygwin, but the detected python appeared to be Cygwin's rather than Anaconda's.

Instead, I dropped into an old-fashioned Windows command prompt and fired up Anaconda from there. A simple "activate pystat" command brought me into the correct environment. I had to learn, though, that "cd d:\src" does not automatically bring one into that folder; apparently typing "d:" as a separate command first is required. Clearly I'm a Windows command prompt newbie.
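
For fellow command prompt newbies, the sequence looks something like this (cmd's cd also has a /d flag that, if I understand correctly, switches drive and directory in one step):

rem Switch drives first, then change directory
d:
cd \src

rem ...or do both in one command
cd /d d:\src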

From there, both matplotlib and mayavi worked out of the box. I didn't even need to use the custom launch script that I had used previously. The beauty of Anaconda is that it's cross-platform, so I next plan to see if it will work as seamlessly back on my Mac as it has on Windows. Or maybe by then, vtk and brewed python will be friends again.
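
Incidentally, a quick smoke test along these lines (my own habit, not part of the original setup) can confirm that both libraries at least import and render:

# Render a trivial plot to a file, using the Agg backend to avoid needing a display
python -c "import matplotlib; matplotlib.use('Agg'); import matplotlib.pyplot as plt; plt.plot([1, 2, 3]); plt.savefig('smoke.png')"
# Confirm that mayavi imports cleanly
python -c "from mayavi import mlab; print('mayavi OK')"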

Sunday, June 26, 2016

Crowdsourced Prayer

I've always wondered why we should ask others to pray for us if God hears the prayers even of one--and especially if he knows our needs even before we ask of him. Having others pray for us might seem to be redundant, or if we were to take it to the extreme, even faithless.

One Scripture that comes to mind regarding the prayers of many is from Revelation, where the multitude of prayers is likened to incense being poured out to God. There is no indication that these prayers somehow grab his attention better or that they might get lost otherwise, but rather that just as incense is a way of worship, so also the prayers of many combine to form collective praise to our God.

Asking people to pray for us may then be less about maximizing the chances that God will answer our prayer and more about creating an opportunity for more people to worship God. And how does that happen through prayer? By asking others to join in our requests, we give more people the opportunity to see how God answers them. By crowdsourcing our prayer, we collectively reap the rewards. Our concerns become others' concerns, and likewise our answered prayers become others' answered prayers.

Wednesday, August 19, 2015

Troubleshooting SELinux

As much as I hoped to make the greatest use of SELinux to secure my servers, I've typically dropped it into "permissive" mode after encountering cryptic security restrictions. I recently set up a basic Fedora server on DigitalOcean including SELinux and decided to try sticking it out with SELinux in full "enforcing" mode.

And as usual, I encountered a cryptic "failed to start" error while reconfiguring an Apache server. Thanks to a comment on one of the Fedora forums, I found a convenient tool, audit2why, to help decipher the error message. After piping the error output into audit2why (e.g., systemctl status httpd.service | audit2why), the tool actually gave me the specific command to adjust the setting in SELinux.
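
More generally, I understand the usual pattern is to pull recent AVC denials straight from the audit log, and audit2allow can even turn them into a local policy module; a sketch, assuming the audit tools are installed (the module name is just an example):

# Explain recent SELinux denials from the audit log
ausearch -m avc -ts recent | audit2why

# Or generate and install a local policy module covering those denials
ausearch -m avc -ts recent | audit2allow -M httpd_local
semodule -i httpd_local.pp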

Now we can have peace of mind and peace of configuration too!

Thursday, December 11, 2014

Migrating Git repo from Eclipse to Android Studio

I've been tracking the Android Studio releases from afar for a while and have been looking for an excuse to truly give it a shot. Besides the fact that Java 7 for IntelliJ wasn't really supported on the Mac for some time, the transition from the Eclipse build system to Gradle seemed a bit invasive.

One of the biggest barriers in my mind was migrating my Git repository to the new Gradle structure without losing file history. Android Studio will migrate an Eclipse project to an Android Studio, Gradle-based one, which involves moving a number of folders as well as replacing Eclipse-specific configuration files with Gradle-specific ones. I wanted to use the Android Studio migration tool but was unsure whether Git would identify these changes as file moves or treat them as entirely new files, which would make tracking file history more cumbersome.

Some have found that Git will identify these files as simply moved/renamed, but unfortunately I did not experience that. Instead, I found that I could run the migration tool in a separate location and manually recreate the moves on the original repository. Here's how I did that, in case anyone else is trying to make the move as well.
  1. Create a new branch in the Eclipse project location. This will make it easier to delete any changes if migration gets messy.
  2. Start the Android Studio migration tool. To get it running, I closed any open Android Studio project, and the wizard popped up. I then chose to import a non-Android Studio project and pointed to the Eclipse project location. For the new project, I chose a distinct outside folder location.
  3. Migrate the project! There were some errors upon actually building the project, but at least for me the migration itself went smoothly.
  4. Mirror the major folder moves. Back in the original Eclipse project location, I started moving folders via git to mirror the new structure. The key is that there are really only a few folders to move, so it's not as painful as it looks. Here are the main commands I ran, with help from this post:
    mkdir -p app/src/main/java
    git mv src/com app/src/main/java
    git mv res app/src/main
    git mv assets app/src/main
    git mv AndroidManifest.xml app/src/main

    "src/org" can be changed to whatever your source sub-root folder is, and "assets" or other folders may need to be in/excluded according to your project files.
  5. Commit these changes in the new branch. Now they are ready to be picked up in your new Gradle-based project!
  6. Copy the Eclipse project's .git folder into the new Gradle-based project location. This will make the Git repo with all the major moves available in your Gradle location, including the new branch where all the dirty work is being done.
  7. Turn on VCS for the new project. Android Studio has a dedicated VCS menu with the option to turn on various versioning systems. Once it's activated, you can show changes and confirm that Git sees all those moved files as being in the right place.
  8. Add all the new Gradle-based files. The nice thing about using the Android Studio VCS is that it knows not to show a bunch of Gradle build files that shouldn't be version controlled. It also deletes a bunch of Eclipse-specific project files from Git simply because they weren't migrated. The challenge, however, is that it doesn't by default offer to add all the files you do in fact need. Here are the files that appear to be key, which I learned from viewing the Android Samples project structure:
    app/build.gradle
    gradle
    build.gradle
    gradlew
    gradlew.bat
    settings.gradle
  9. Commit all changes to Git. If you've pushed them, you should be able to pull them on another machine right from within Android Studio to access your project there.
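
To reassure yourself that history really survived the restructuring, following a moved file through its renames should still work; for example (using one of the paths moved above):

# Follow a moved file's history across the Gradle restructuring
git log --follow --oneline -- app/src/main/AndroidManifest.xml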
From my brief taste of Android Studio so far, it's been quite a treat!