Several weeks ago, I came across Bret Victor’s presentation “Inventing on Principle”. In it, among many amazing prototypes, Bret demonstrates a little app in which he can write code and see the live running results next to the code (starts at 16:47).
Bret Victor – Inventing on Principle from CUSEC on Vimeo.
After seeing this little app, my first thought was simply that this was genius; the next was “I want this for C#”. Now, this idea demos really well and is fairly simple to prototype, but there are a lot of problems you start running into when you try to make an actual tool out of it, so I started a research project currently called “Instant”. The plan is to blog about the evolution of the project as I work towards making something real out of it, including what problems I run into and how I solve them. This will serve not only as a running set of examples on using Roslyn, but hopefully also as an interesting exploration of the space. We’ll start by matching Bret Victor’s demo, and then take it from there (hopefully to a Visual Studio and MonoDevelop plugin). It’s worth noting that I have absolutely no experience with compilers, runtime internals, or any of the other things you might expect to be necessary for this. As a result, everything is very much a work in progress.
Here’s a working prototype roughly matching Bret’s demo:
In this demo you can see I implement binary search and I can see the results of the code execution on the right. When I change the search parameter I can instantly see that there’s a bug, so I fix the loop and see the new behavior immediately. So how did I do this?
The first thing we’ll need is a data structure representing the code itself. I knew about Roslyn already, and while I realized the importance of the project, I never personally had a need for it until now. What I knew going in was that Roslyn was going to give me an object representation of the code. What I didn’t know is that it has a ScriptEngine class, and it rocks. My initial thought was that I would lift all the local variables from a single test method, run each statement one at a time through the script engine, and then read the current variable states from the hosting class I’d created. So I dug in and wrote a SyntaxRewriter that did the lifting for me, and then it occurred to me how much I was over-complicating things.
With the ScriptEngine class, you can execute code in the context of a Session. A Session lets you supply a “host object”, which you initialize and then pass to the session. In the code of your “script”, you can call methods on this host object as if the code you’re running were just another method on the same class. This, combined with the ability to rewrite the code, gives us a simple solution: rewrite the code to include calls to methods that log the values. The solution that I felt covered the most scenarios on its own was to wrap the value of every assignment in a call to a logging method that returns the object.
Basically, turn:
int x = 5;
into:
int x = LogObject ("x", 5);
We’ll use two classes to get there: one to act as the rewriter, the other to act as the host object (or “logger”). The logger, as you might imagine, is quite simple:
public class Logger
{
    public string Log
    {
        get { return this.builder.ToString(); }
    }

    public T LogObject<T> (string name, T value)
    {
        this.builder.AppendLine (name + " = " + value);
        return value;
    }

    private StringBuilder builder = new StringBuilder();
}
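To make the runtime behavior concrete, here is a minimal sketch of what rewritten code does once it calls into the logger. It does not use Roslyn at all: the `RewrittenByHand` class and its `Run` method are hypothetical names made up for this illustration, the rewrite is done by hand, and the host object is passed explicitly rather than being supplied by a scripting session.

```csharp
using System;
using System.Text;

// Same shape as the Logger in this post, repeated so the sample compiles on its own.
public class Logger
{
    private readonly StringBuilder builder = new StringBuilder();

    public string Log
    {
        get { return builder.ToString(); }
    }

    public T LogObject<T>(string name, T value)
    {
        builder.AppendLine(name + " = " + value);
        return value;
    }
}

// Hypothetical class, for illustration only.
public static class RewrittenByHand
{
    // What "int x = 5; x = x + 2;" looks like after wrapping each assigned
    // value in a LogObject call. The host is explicit here; in a scripting
    // session the call would be written without the "host." prefix.
    public static int Run(Logger host)
    {
        int x = host.LogObject("x", 5);
        x = host.LogObject("x", x + 2);
        return x;
    }

    public static void Main()
    {
        var logger = new Logger();
        Run(logger);
        Console.Write(logger.Log); // prints "x = 5" and "x = 7" on separate lines
    }
}
```

Because `LogObject` returns the value it was given, wrapping an expression in it does not change what the surrounding code computes; it only records the value on the way through.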
Roslyn provides us with a SyntaxRewriter class that uses the visitor pattern to make it very simple to get up and running rewriting code. If you’ve ever used ExpressionVisitor while implementing a LINQ provider, you should feel right at home. We’re given two ways to construct new syntax: 1) use static methods and pass in object representations of all of the elements making up the syntax, or 2) pass in a string representing just the piece of syntax we want and have Roslyn parse it for us. #1 is frankly a bit cumbersome currently, so we’ll go with #2.
So, for our previous example, we want to go from a variable name and value to a log call for them:
internal class LoggingRewriter : SyntaxRewriter
{
    private ExpressionSyntax GetLogExpression (string name, SyntaxNode value)
    {
        return GetLogExpression (name, value.ToString());
    }

    private ExpressionSyntax GetLogExpression (string name, string value)
    {
        return Syntax.ParseExpression ("LogObject (\"" + (name ?? "null") + "\", " + value + ")");
    }
}
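The escaped quotes make that string a little hard to read, so here is a small sketch that shows only the text being built, without the Roslyn parse step. The `LogCallText` class and its `Build` method are hypothetical names used for this illustration; the concatenation itself is copied from `GetLogExpression`.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class LogCallText
{
    // Mirrors the string concatenation in GetLogExpression. The real method
    // hands this text to Syntax.ParseExpression; here we just return it.
    public static string Build(string name, string valueSource)
    {
        return "LogObject (\"" + (name ?? "null") + "\", " + valueSource + ")";
    }

    public static void Main()
    {
        Console.WriteLine(Build("x", "5"));                  // LogObject ("x", 5)
        Console.WriteLine(Build("mid", "(low + high) / 2")); // LogObject ("mid", (low + high) / 2)
    }
}
```

Note that the value is pasted in as source text, so whatever expression appeared on the right-hand side is evaluated as an argument to the logging call.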
Now we need to actually use it, so we’ll start with variable declarations (which are not the same as assignments):
protected override SyntaxNode VisitVariableDeclarator (VariableDeclaratorSyntax node)
{
    // We don't care about: int x;
    if (node.InitializerOpt == null)
        return base.VisitVariableDeclarator (node);

    EqualsValueClauseSyntax equals = node.InitializerOpt;

    // Get a log expression using the variable name and the value it's being assigned.
    ExpressionSyntax logExpression = GetLogExpression (node.Identifier.ValueText, equals.Value);

    // Update the immutable syntax object with value expression
    equals = equals.Update (equals.EqualsToken, logExpression);

    // Update the declaration with our new initialization expression
    return node.Update (node.Identifier, null, equals);
}
Now that we have a simple rewriter and our logger, let’s actually see it work. I’ll go ahead and assume you know how to set up a simple UI with two textboxes.
public class EditorViewModel
{
    public EditorViewModel()
    {
        // We need to set up our 'scripting' environment,
        // we'll specify assembly references and using
        // statements here for now
        this.scripting = new CommonScriptEngine (
            new []
            {
                // References
                typeof(string).Assembly, // mscorlib
                typeof(Logger).Assembly  // our app
            },
            new[]
            {
                // Using statements
                "System"
            });
    }

    // ... UI stuff ... //

    private CommonScriptEngine scripting;

    private void ProcessInput()
    {
        var logger = new Logger();

        // Parse the code we've typed into a node that we can
        // manipulate
        SyntaxNode root = Syntax.ParseCompilationUnit (Input);

        var rewriter = new LoggingRewriter();

        // Here we hand our rewriter the node that we just
        // parsed and we'll get back the rewritten node.
        root = rewriter.Visit (root);

        try
        {
            // Create a new scripting session with the host
            // object that we created to log our state
            Session s = Session.Create (logger);

            // We hand the scripting engine our final code
            // to run and the scripting session to use.
            this.scripting.Execute (root.ToString(), s);
        }
        catch (CompilationErrorException cex)
        {
            // We'll ignore compilation errors, there'll be
            // plenty while you type. Outputting this, however,
            // is a great way to debug your syntax rewriter.
        }
        catch (Exception ex)
        {
            // We want to show any other errors, like those from
            // the code that you wrote.
            Output = ex.ToString();
            return;
        }

        // At this point we have something usable, show the output from
        // the logging host object.
        Output = logger.Log;
    }
}
And there we go: any time we declare a new variable, we can see it and what its value is. Now that we’ve got the infrastructure all set up, let’s go a little further and handle all assignments. Assignments are binary expressions: they have a left side (the variable) and a right side (the value being assigned), so we’ll check each binary expression to see if it’s an assignment and rewrite its value appropriately:
protected override SyntaxNode VisitBinaryExpression (BinaryExpressionSyntax node)
{
    switch (node.Kind)
    {
        case SyntaxKind.AssignExpression:
            IdentifierNameSyntax identifierSyntax = (IdentifierNameSyntax) node.Left;

            // We just want the variable name
            string name = identifierSyntax.PlainName;

            // Get a logging expression for the value
            ExpressionSyntax logExpression = GetLogExpression (name, node.Right);

            // Update the node with our logging expression
            return node.Update (node.Left, node.OperatorToken, logExpression);

        default:
            return base.VisitBinaryExpression (node);
    }
}
Update (4/18): The above code has a bug! This works great for simple expressions like x = 5, but what about something more complex like
i < x = 5
In this case, the left side of the binary expression is:
i < x
Casting this to a name will obviously not work, so we need to search for the name. We’ll look at the right side of any binary expression we come across and return once we find an IdentifierName:
private IdentifierNameSyntax FindIdentifierName (ExpressionSyntax expression)
{
    // If we're at a name, we can just return it
    IdentifierNameSyntax name = expression as IdentifierNameSyntax;
    if (name != null)
        return name;

    // If we find a binary expression, look at the right side
    BinaryExpressionSyntax binaryExpression = expression as BinaryExpressionSyntax;
    if (binaryExpression != null)
        return FindIdentifierName (binaryExpression.Right);

    // If we don't know what to do, return null so we explode and we can find the bug ;)
    return null;
}
Now, we’ll replace a line in our previous code to use this:
IdentifierNameSyntax identifierSyntax = FindIdentifierName (node.Left);
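To see the search in isolation, here is a rough sketch of the same walk over a toy expression tree. The `Expr`, `Name`, `Literal`, `Binary` and `NameSearch` types are hypothetical stand-ins invented for this example; they are not Roslyn types, and the real syntax nodes carry much more information.

```csharp
using System;

// Hypothetical stand-ins for Roslyn's syntax node types, for illustration only.
public abstract class Expr { }

public sealed class Name : Expr
{
    public readonly string Text;
    public Name(string text) { Text = text; }
}

public sealed class Literal : Expr
{
    public readonly int Value;
    public Literal(int value) { Value = value; }
}

public sealed class Binary : Expr
{
    public readonly Expr Left;
    public readonly Expr Right;
    public Binary(Expr left, Expr right) { Left = left; Right = right; }
}

public static class NameSearch
{
    // Same walk as FindIdentifierName: stop at a name,
    // otherwise follow the right side of a binary expression.
    public static Name FindName(Expr expression)
    {
        Name name = expression as Name;
        if (name != null)
            return name;

        Binary binary = expression as Binary;
        if (binary != null)
            return FindName(binary.Right);

        // Anything else is not handled
        return null;
    }

    public static void Main()
    {
        // Models the left side "i < x" from the example
        Expr left = new Binary(new Name("i"), new Name("x"));
        Console.WriteLine(FindName(left).Text); // prints "x"

        // A right side that ends in a literal has no name to find
        Expr noName = new Binary(new Name("i"), new Literal(5));
        Console.WriteLine(FindName(noName) == null); // prints "True"
    }
}
```

The second case shows the limit of this approach: when the rightmost leaf is not a name, the search returns null and the caller has to deal with it.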
And there we have it: logging for simple assignments! In part two, we’ll expand upon this with more assignments and pre/postfix unary operators. If you’d like to play with this in the meantime, install the Roslyn CTP and grab the prototype source from GitHub. Fair warning though, it’s changing rapidly and may not work at any given point.
Note: This post is based on the first Roslyn CTP; I intend to update it when a new version is released.
