DIY – Your Rottentomatoes Ratings System for Redbox Movies

NOTE: I finally developed a Chrome plugin called Tomato Box out of this post! You can download it from the Chrome Web Store, here.

Update: My friend Pemba has made several enhancements to it and ported the plugin to the Firefox browser. You can download the source code of the Firefox add-on from here.

Every Friday, I rent a DVD from a Redbox kiosk near our place. I could always find a good movie, provided I spent a lot of time looking up the ratings of the available titles on rottentomatoes.com or IMDb. It would have been nice if Redbox.com showed IMDb or Rotten Tomatoes ratings side by side.

To cut down the time I spent finding the right movie, I created a Chrome plugin that displays ratings from rottentomatoes.com on redbox.com movie pages. The image above is a screenshot of the ratings added by the plugin.

Rottentomatoes.com provides a few easy-to-use RESTful APIs for accessing movie information, including ratings, from their data store (here). They call their ratings critics_score ([C] in the screenshot) and audience_score ([A]). One of the APIs searches the database and returns the ratings, cast, year of release, and other details for a movie. The API is invoked by passing the name of the movie as a parameter. All API invocations are authenticated with an API key passed on each call; the key is obtained by registering on the Rotten Tomatoes developers' website.
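In essence, the call is just an authenticated GET plus some JSON picking. Here is a sketch; the endpoint, query parameters, and field names match the server code later in this post, but the helper names (searchUrl, extractScores) are mine, and API_KEY is a placeholder for your own key:

```javascript
// Sketch: build the Rotten Tomatoes search URL and pull the two scores
// out of a response body. Not the full server - just the API shape.
var API_KEY = "YOUR_API_KEY";

function searchUrl(movie) {
    return "http://api.rottentomatoes.com/api/public/v1.0/movies.json" +
           "?q=" + encodeURIComponent(movie) +
           "&page_limit=5&page=1&apikey=" + API_KEY;
}

// Given the JSON body of a response, return the ratings of the first hit.
function extractScores(body) {
    var result = JSON.parse(body);
    if (!result.movies || result.movies.length === 0) {
        return null;
    }
    var ratings = result.movies[0].ratings;
    return { critics: ratings.critics_score, audience: ratings.audience_score };
}
```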

I used a content script to parse the name of the movie from Redbox's movie pages and passed it to a node.js HTTP server running locally. The call rate for my rottentomatoes.com API key is limited to 10 calls/sec and 10,000 calls per day, so I used the node.js server as a poor man's cache to return ratings locally before hitting the Rotten Tomatoes data store. I also used a simple scheduler in the content script to keep it under 4 calls per second.

The complete source code of this prototype is available in Git, here. The instructions to install the plugin are similar to those in my previous post, here.

Here is the code for the node server. The poor man's cache is a set of files named after the movies.

//change API_KEY to your auth key obtained from rottentomatoes.com
var API_KEY = "YOUR_API_KEY";

var sys = require('util'),
    http = require('http'),
    fs = require('fs'),
    index;

function respond(content, response, movie, stringify)
{
    var data;
    if(content == null){
        content = new Object();
        content.critics_score = "none";
        content.audience_score = "none";
        content.id = "";
    }
    if(stringify == false){
        data = content;
    }else{
        data = JSON.stringify(content);
    }
    console.log("Sending data:"+ data);
    response.writeHead(200, {'Content-Length': data.length, 'Content-Type': 'application/json'});
    response.write(data);
    response.end();
    return data;
}

function findRanking(movie, response)
{
    try{
    var moviePath = "./movies/" + movie;
    fs.lstat(moviePath, function(err, stats){
      //return the ranking by reading the file
      if(stats && stats.isFile()){
        var content = fs.readFileSync(moviePath);
        var data = respond(content, response, movie, false);
        console.log("Found Movie:" +movie +" data:" + data);
      }else{
         //else query rotten tomatoes directly
         getRankings(movie, function(id, ratings){
              if(id == null){
                   respond(null, response, movie, true);
                   return;
              }
              if(ratings){
                   console.log("critics ranking is:" + ratings.critics_score + ". audience ranking is:" + ratings.audience_score);
                   var content = new Object();
                   content.critics_score = ratings.critics_score;
                   content.audience_score = ratings.audience_score;
                   content.id = id;
                   content.name = movie;
                   var jsonContent = respond(content, response, movie, true);
                   try{
                       var ifile = fs.openSync(moviePath, "w");
                       fs.writeSync(ifile, jsonContent);
                       fs.closeSync(ifile);
                   }catch(err){
                       console.log("Exception caught:" + err);
                   }
              }
         });
      }
    });
    }catch(err){
        console.log("Exception:" + err);
        respond(null, response, movie, true);
    }
}

function getRankings(movie, callback) {
    console.log("movie is :" + movie);
    var urlPath = "/api/public/v1.0/movies.json?q=" + encodeURIComponent(movie) + "&page_limit=5&page=1&apikey=" + API_KEY;
    var options = {
      host: 'api.rottentomatoes.com',
      port: 80,
      path: urlPath,
      method: 'GET'
    };
    var msgBody = "";
    var req = http.request(options, function(res) {
            console.log('STATUS: ' + res.statusCode);
            res.setEncoding('utf8');
            res.on('data', function (chunk) {
                msgBody += chunk;
            });
            res.on('end', function(){
                if(msgBody!= "" && res.statusCode == 200){
                    var movieObject = JSON.parse(msgBody);
                    if(movieObject.movies == null || movieObject.movies.length == 0){
                       callback(null, null);
                       return;
                    }
                    if(movieObject.movies.length == 1){
                      // only a single movie - lets assume that this is THE movie we are looking for
                      var item = movieObject.movies[0];
                      callback(item.id, item.ratings);
                      return;
                    }

                    // tallying the results with year of release, etc.  will yeild more accurate information
                    for(var i = 0; i<movieObject.movies.length; i++){
                        var item = movieObject.movies[i];
                        if(movie == item.title.toUpperCase()){
                           var id = item.id;
                           console.log("Id of the movie is:" + id);
                           callback(id, item.ratings);
                           return;
                        }else{
                          console.log("title=[" + movie + "] and movie=[" + item.title + "] does not match.");
                        }
                    }
                    callback(null, null);
                }else{
                    callback(null, null);
                }
            });
    });

    req.on('error', function(e) {
            console.log('problem with request: ' + e.message);
    });

    req.end();
}

function normalize(name)
{
    var decoded  = decodeURIComponent(name);
    var replaced = decoded.replace(/\(.*\)/,"");
    var trimmed = replaced.trim().toUpperCase();
    return trimmed;
}
function readFile(filename) {
    //helper used below to serve static files (missing from the original listing)
    return fs.readFileSync('./' + filename);
}

http.createServer(function (request, response) {
  console.log(request.url);
  var result = request.url.match(/^\/(.*\.js)/);
  if(result) {
      response.writeHead(200, {'Content-Type': 'text/javascript'});
      response.write(readFile(result[1]));
      response.end();
  } else if(request.url.indexOf("add?&id=") != -1){
      var vals = request.url.split("=");
      console.log(vals[vals.length-1]);
      var nMovie = normalize(vals[vals.length-1]);
      findRanking(nMovie, response);
  }
  else{
      response.writeHead(200, {'Content-Type': 'text/html'});
      response.write(readFile('index.html'));
      response.end();
  }
}).listen(8989);

console.log('Server running at http://127.0.0.1:8989/');

Here is the code for the content script. The trick was to control the rate of calls to the server, which was achieved using the window.setTimeout API with a queue of parsed movies.

function MovieHandler(target){
    this.target = target;
}

MovieHandler.prototype.getTarget = function(){
    return this.target;
}

MovieHandler.prototype.callback = function(content){
     if(content){
        console.log("critics_score is :" + content.critics_score );
        console.log("audience_score is :" + content.audience_score);
        $(this).append("<em><strong>[C]</strong>" + content.critics_score + " <strong>[A]</strong>" + content.audience_score + "</em>");
     }
};
var doAjax = function(dataString, onSuccess){
    $.ajax({
        type: "GET",
        url: "http://127.0.0.1:8989/add?",
        data: dataString,
        context: onSuccess.getTarget(),
        success: onSuccess.callback
    });
};

var movieHandlers = new Array();
$.each( $(".box-wrapper"), function(index, obj){
        var movieHandler = new MovieHandler(obj);
        movieHandlers.push(movieHandler);
});
function schedule(){
    console.log("schedule invoked. queue size=" + movieHandlers.length);
    var i =0;
    var movieHandler;
    while((i < 2) && (movieHandlers.length > 0)){
        movieHandler = movieHandlers.shift();
        var name = $(movieHandler.getTarget()).attr('name');
        if(name != null){
            console.log("Querying movie:" + name);
            doAjax("id=" + name, movieHandler);
            i++;
        }
    }
    if(i == 2){
        setTimeout(schedule, 2000);
    }
}
console.log("scheduling timer 2 sec");
//found that 2 calls per sec is optimal.
//playing safe.
setTimeout(schedule, 2000);

Finally, the code for the plugin’s manifest. I added the URL of the node server (http://127.0.0.1:8989) in the permissions section, which is required for cross-domain XHR.

{
    "name": "REDBOX ranking",
        "description": "Parses videos out of Redbox and ranks",
        "version": "0.1",
        "permissions": ["contextMenus", "http://127.0.0.1:8989/"],
        "content_security_policy": "default-src 'self'",
        "content_scripts": [
            {
                "matches": ["*://*.redbox.com/*"],
                "js": ["jquery.min.js","parser.js"]
            }
        ]

}
Posted in Open Source, Technology, Thoughts | 5 Comments

How to scare off a Java developer without a pointer?

Ask him this.

What will the following program print?

#include <iostream>
#include <vector>

struct counter
{
    int val;
    counter():val(0){}
};
int main()
{
    std::vector<counter> counters;
    counters.push_back(counter());
    counters.push_back(counter());
    for(int i = 0; i < counters.size(); i++){
        counter t = counters[i];
        t.val++;
    }
    std::cout << counters[0].val << std::endl;
    return 0;
}

If he says zero, ask him 🙂 how he will make it print 1 with a single-line change. Don’t allow him to 1) change the std::cout line or 2) change the initial value of counter.val. Watch him curse C++ 🙂

 

Posted in Uncategorized | 1 Comment

DIY – Your Minimal Media Server with Node.js and Content Scripts

This entry is about a tool that can mine music blogs for YouTube videos and create a nice playlist to be played later. I call it the Minimal Media Server, because it’s actually a local HTTP server on your computer that you use to push videos via REST-like calls and play them back with a customized YouTube player.

I created this prototype with Node.js and a content script in a few hours of effort. The source code is available in Git, here. The code base is pretty small, ~150 LOC!

In order to contain threats from JavaScript execution, browsers run JavaScript loaded by a web page in a lower privilege mode, which limits the script’s access to the web page it is in. To mine the web pages of different blogs, I needed to run some privileged code in the browser that could execute across all of the pages I opened. Google Chrome supports a concept called a Content Script that lets such JavaScript run with elevated privileges in the browser to access and update the HTML content of loaded web pages. A Content Script is similar to a browser extension in terms of privilege, and the script author can specify when it runs.

I wrote the following content-script to parse youtube videos embedded in the blogs:

var doAjax = function(dataString, onSuccess){
    $.ajax({
        type: "GET",
        url: "http://127.0.0.1:8888/add?",
        data: dataString,
        success: onSuccess
    });
};
var onSuccess = function(response) {
    console.log("added to the playlist");
};

$.each( $("iframe"), function(index, obj){
        var src = $(obj).attr('src');
        if(src.indexOf("youtube.com/embed/") != -1) {
            var values = src.split('/');
            if(values.length == 5) {
                doAjax("id=" + values[4], onSuccess);
            }
        }
});

Typically, a video from YouTube is embedded inside an iframe by blog sites like Blogspot. Hence the script’s work starts at the $.each call, which iterates over all iframes in the loaded web page. The typical format of an embedded YouTube video URL is “http://www.youtube.com/embed/[video_id]”. The script looks for the string “youtube.com/embed” in the src attribute of the iframe, copies the video_id portion, and sends it to the HTTP server listening on 127.0.0.1:8888 via an AJAX call.
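The id extraction itself can be factored into a tiny function and tested in isolation. A sketch (extractVideoId is my name for it, not part of the original script; the split-by-'/' logic is the same as above):

```javascript
// Sketch: pull the video id out of an embed URL of the form
// http://www.youtube.com/embed/[video_id]; returns null if it is not one.
function extractVideoId(src) {
    if (src.indexOf("youtube.com/embed/") === -1) {
        return null;
    }
    var values = src.split('/');
    return values.length === 5 ? values[4] : null;
}
```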

There are a few points I want to highlight regarding the above script. First, a content script works only in a Chrome browser. Second, there is a piece of metadata associated with the script, in a file called “manifest.json”, that makes it a content script.

{
    "name": "Video parser",
        "description": "Parses videos out of http://www.youtube.com/embed/[video_id]",
        "version": "0.1",
        "permissions": ["contextMenus"],
        "content_security_policy": "default-src 'self'",
        "content_scripts": [
            {
                "matches": ["*://*/*"],
                "js": ["jquery.min.js","parser.js"]
            }
        ]

}

The most important part of the metadata file is the “content_scripts” section. The “js” attribute indicates which JavaScript files are to be treated as content scripts; “parser.js” is the script described above. Since we use jQuery inside parser.js, we load jQuery as a content script as well. The parser script is executed for each web page loaded in the browser that matches the pattern in the “matches” attribute, in this case all web pages; it can be tweaked to limit it to a few domains. Finally, the content script can be installed in Chrome by navigating to “chrome://settings/extensions” and loading it as an Unpacked Extension in developer mode. Here is a screenshot:

Let’s discuss the server. The server listens for HTTP requests at 127.0.0.1:8888 and is written in JavaScript, deployed as a Node.js application. Node.js provides several ready-to-deploy modules to develop and deploy client-server applications with minimal effort. It internally uses Google Chrome’s V8 JavaScript runtime engine, which enables scripting such applications in JavaScript. Node.js is trending now, and there are several cloud vendors (no.de, etc.) that let you deploy node applications on the Internet.

With the content script in place, I implemented a server that stores the video ids retrieved by the script and provides a way to play them back as a playlist. The main server logic is the following piece of code:

var sys = require('util'),
    http = require('http'),
    fs = require('fs');

function readFile(filename) {
    var content = "hello";
    content = fs.readFileSync('./' + filename);
    return content;
}
// Execution starts here
http.createServer(function (request, response) {
  console.log(request.url);
  var result = request.url.match(/^\/(.*\.js)/);
  if(result) {
      // Serve JavaScript
      response.writeHead(200, {'Content-Type': 'text/javascript'});
      response.write(readFile(result[1]));
  } else if(request.url.indexOf("add?&id=") != -1){
      // Process video_ids
      var vals = request.url.split("=");
      console.log(vals[vals.length-1]);
      var playlist = fs.openSync("./playlist.csv", "a+");
      // write a trailing comma so "[" + contents + "]" stays valid JavaScript
      // even for the very first entry
      fs.writeSync(playlist, "'" + vals[vals.length - 1] + "', ");
      fs.closeSync(playlist);
  }
  else{
      // Serve index.html
      var ifile = fs.openSync("./playlist.js", "w");
      var icontent = readFile("./playlist.csv");
      var list = "var playlist = [" + icontent + "];\n";
      console.log(list);
      fs.writeSync(ifile, list);
      fs.closeSync(ifile);
      response.writeHead(200, {'Content-Type': 'text/html'});
      response.write(readFile('index.html'));
  }
  response.end();
}).listen(8888);

console.log('Server running at http://127.0.0.1:8888/');

The server can be started as “node server.js” from the shell. The server logic is executed in the callback passed to the createServer method: as a client connects to 127.0.0.1:8888 over HTTP, the callback is executed. In our case, when the content script parses a video_id, it makes an HTTP GET call to 127.0.0.1:8888; the callback is invoked, and the video_id from the GET request is retrieved and stored in a file called playlist.csv. On the other hand, when the user points the browser to http://127.0.0.1:8888/, the callback serves out the index.html file and a few JavaScript files that load a YouTube player to play the videos in playlist.csv.

Here is the content of the index.html


<script type="text/javascript" src="jquery.min.js"></script>
<script type="text/javascript" src="swfobject.js"></script>
<script type="text/javascript" src="playlist.js"></script>
<script type="text/javascript">
var player;
function onYouTubePlayerReady(playerId) {
    console.log("invoked onplayer ready");
    player = document.getElementById("myytplayer");
    player.loadPlaylist(playlist);
}
$(function(){
    var params = { allowScriptAccess: "always" };
    var atts = { id: "myytplayer" };
    swfobject.embedSWF("http://www.youtube.com/v/JzGAmLNwLjw?enablejsapi=1&version=3&playerapiid=ytplayer",
                                   "ytapiplayer", "425", "356", "8", null, null, params, atts);
    $("#prev_button").click(function() {
        if(player) {
          player.previousVideo();
        }
     });
    $("#next_button").click(function() {
        if(player) {
          player.nextVideo();
        }
     });
});
</script>
<div id="ytapiplayer">You need Flash player 8+ and JavaScript enabled to view this video.</div>
<div id="controls">
   <input id="prev_button" type="button" value="prev" />
   <input id="next_button" type="button" value="next" />
</div>

The player is loaded using the swfobject wrapper library, which can be found at http://code.google.com/p/swfobject/.

I have been using this setup against http://xonggit.blogspot.com/, one of the music blogs I visit.

In reality, I think the content script could be much smarter: it could recognize not only YouTube videos, but video/audio from sites like eSnips, Vimeo, etc. Maybe even place a custom button next to a likely link, offering to add the video to the playlist. The server could act as a proxy to the content sites and provide a unified interface to the users. The player, hosted in the cloud, could be wrapped in PhoneGap for Android and iOS. Endless possibilities!

Posted in Open Source, Technology, Thoughts | 2 Comments

XML Schema : Elements vs. Attributes

It is fairly common to encounter the task of converting data from a legacy format to XML.
Defining the XML schema is one part of the problem. Let’s assume that you want to convert some metadata originally defined in the Compact NodeType Definition (CND) format used in the Java Content Repository 1.x/2.x specification.
Here is an example of such a data type:

  [rsr:distfile] 
  - rsr:type (STRING) mandatory
  - rsr:releasenotes (STRING) mandatory
 

This weird-looking section of code actually defines a type called “rsr:distfile” with two attributes, “rsr:type” and “rsr:releasenotes”.

A one-to-one attribute-to-attribute mapping leads to this snippet of XML schema:

<xs:complexType name="distfile">
    <xs:attribute name="type" type="xs:string"/>
    <xs:attribute name="releasenotes" type="xs:string"/>
</xs:complexType>

However, this simple solution can cause nightmares: both attributes are of string type and can therefore hold any random string in the legacy store, so you may have to escape the string content (XML entity escaping) before adding it as XML attribute values.
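For illustration, a minimal escaping helper for such string content might look like the following (this is my own sketch, not part of the original conversion code; a real converter would use its XML library’s escaping):

```javascript
// Sketch: escape the five XML special characters so arbitrary legacy
// strings can be placed safely inside attribute values or elements.
// The ampersand must be replaced first, or it would re-escape the others.
function escapeXml(s) {
    return s.replace(/&/g, "&amp;")
            .replace(/</g, "&lt;")
            .replace(/>/g, "&gt;")
            .replace(/"/g, "&quot;")
            .replace(/'/g, "&apos;");
}
```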

On the other hand, an attribute-to-element mapping for the above case produces much cleaner XML.
Here is the new schema snippet:

<xs:complexType name="distfile">
    <xs:sequence>
        <xs:element maxOccurs="1" minOccurs="1" name="type" type="xs:string"/>
        <xs:element maxOccurs="1" minOccurs="1" name="releasenotes" type="xs:string"/>
    </xs:sequence>
</xs:complexType>

You now also have the option to treat the string content as CDATA.
Here is an example XML document produced from the above schema:

<distfile name="copyservice.wsdl">
    <type>wsdl</type>
    <releasenotes><![CDATA[
1.2.4 Added a new element "<error>"
1.2.5 Removed element "<error>"
    ]]></releasenotes>
</distfile>
Posted in Open Source, Technology, Thoughts | Leave a comment

Threads in Qualcomm Brew 3.1.5

One of the odd things about Brew is that its multitasking model is cooperative. In a cooperative model, the executing task must yield explicitly to the task scheduler so that the scheduler can dispatch another task. So, in addition to creating the thread, the programmer is responsible for yielding execution and supplying a resume point where control will eventually return. In the traditional preemptive model, on the other hand, the programmer has to do little on his own; the scheduler preempts and schedules threads as necessary. That’s why an infinite loop is a sin in a cooperative model. POSIX threads are preemptive; the threads in Brew, called IThreads, are cooperative.

In Brew, a single system-level thread is shared across all user applications, and all IThreads created by an application share that same system thread: an example of the many-to-one thread model. This main thread of execution is used for dispatching both user and system-level events to the application, such as key-press, low-battery, low-memory, etc. The thread is also used for invoking callbacks that can be registered with Brew for certain events, e.g., timeouts or data availability on a socket. In reality, we cannot achieve true concurrency in Brew because of the many-to-one model.

I am not very sure why the thread model is many-to-one in Brew. Brew is deployed in a lot of low-end handsets for markets in developing countries, and I suppose the model simplifies the integration of Brew into the OEM’s OS. It also reduces the overhead of error-prone synchronization. Apart from that, IThread can act as a useful abstraction for grouping a set of asynchronous tasks and ensuring an order of execution among them.

Here is an example: the INETMGR_GetHostByName API, which is equivalent to the POSIX gethostbyname, is actually a two-step operation in Brew. First, the user calls INETMGR_GetHostByName, passing the domain name, a global buffer, and a callback as parameters. Brew delegates the operation to the OS and returns. When the DNS is resolved, the buffer is populated and the callback is invoked from the main thread. This is asynchronous behavior, and the developer must be prepared to handle the callback and only then proceed with other network operations. In the following snippet, the actual connect call is made within the callback DNSLookupCallback, once the DNS is resolved.

void DnsLookup(MyApplet *pMe)
{
    //initialize the callback
    CALLBACK_Init(pMe->pcbDNSLookup, DNSLookupCallback, pMe);
    //call the async version of GetHostByName
    INETMGR_GetHostByName(pMe->piNet,
                          pMe->pDNSRes,
                          pMe->addr,
                          pMe->pcbDNSLookup);
}
//this callback will be invoked when the DNS is resolved or an error
//occurred while calling GetHostByName
void DNSLookupCallback(MyApplet *pMe)
{
    int result;
    AEEDNSResult *pres = pMe->pDNSRes;
    if (pres->nResult > 0 && pres->nResult <= AEEDNSMAXADDRS) {
        // We have an IP address
        pMe->inet_addr = pres->addrs[0];
        result = ISOCKPORT_OpenEx(pMe->n.pISockPort, pMe->wFamily, AEE_SOCKPORT_STREAM, 0);
        TXTBLBIGDBGPRINTF("result %d", result);
        if (AEE_SUCCESS == result) {
            TryConnect(pMe);
        } else {
            DBGPRINTF("failed to open port");
        }
    } else {
        DBGPRINTF("Error %d in DNS Lookup", pres->nResult);
    }
}

The same can be implemented using the following IThread idiom, which looks much simpler.

void DnsLookupSync(MyApplet *pMe)
{
    AEEDNSResult *pres = pMe->pDNSRes;
    //pMe->thread is the IThread created
    AEECallback *pcb = ITHREAD_GetResumeCBK(pMe->thread);
    INETMGR_GetHostByName(pMe->n.m_piNet,
                          pres,
                          pMe->session.addr,
                          pcb);
    //calling suspend from here, so that when the DNS lookup completes
    //the execution will resume from the following line.
    ITHREAD_Suspend(pMe->thread);
    DBGPRINTF("result of lookup result:%d ", pres->nResult);
    if (pres->nResult > 0 && pres->nResult <= AEEDNSMAXADDRS) {
        // have an IP address
        pMe->inet_addr = pres->addrs[0];
        ISOCKPORT_OpenEx(pMe->n.pISockPort, pMe->wFamily, AEE_SOCKPORT_STREAM, 0);
        TryConnect(pMe);
    } else {
        DBGPRINTF("Error %d in DNS Lookup", pres->nResult);
    }
}

Creating an IThread is, however, costly. An IThread needs a stack, which is allocated from the heap available to the application. Moreover, with the many-to-one model and cooperative nature, more threads do not buy any additional concurrency. So it’s always better to reuse an IThread within an application; this can be achieved using a custom thread context.

Qualcomm ships an IThread-based implementation of the network connect/read/write operations with the Brew 3.1.5 SDK. On my computer, the default install location of the implementation is C:\Program Files\BREW 3.1.5\sdk\src\thrdutil. These APIs take an IThread as an additional parameter compared to their non-threaded counterparts. However, the implementation is a wrapper around Brew’s ISOCKET interface, which is no longer recommended for use; from version 3.1.3 onwards, the new ISOCKPORT API obsoletes ISOCKET for socket programming. Nevertheless, the code in thrdutil still serves as a good reference for IThread usage.

We can reuse an IThread if we use a model similar to a thread pool. Since all threads, events, and callbacks share the same system thread, there is actually no point in having more than one thread in the pool. In addition, we can use a queue so that the tasks submitted to the pool are processed in FIFO order. Such ordering is also useful if we want to serialize the operations that modify the application’s data model.
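Stripped of the Brew specifics, the pattern is a single consumer draining a FIFO queue. Here is a language-neutral sketch of that idea in JavaScript (the names mirror the C code below but are my own, not Brew APIs):

```javascript
// Sketch: one worker, FIFO queue. Tasks run one at a time, in submission
// order; each task signals completion via the done() callback it is given.
function CommandPool() {
    this.queue = [];
    this.busy = false;
}
CommandPool.prototype.execute = function (task) {
    this.queue.push(task);
    this.drain();
};
CommandPool.prototype.drain = function () {
    if (this.busy || this.queue.length === 0) return;
    this.busy = true;
    var task = this.queue.shift();   // FIFO order
    var self = this;
    task(function done() {           // the task calls done() when finished
        self.busy = false;
        self.drain();                // pick up the next queued task
    });
};
```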

Here is an excerpt from an implementation I did:


//creates the pool. The Applet initialization
//function is the best place to create the pool.
CommandPool* createPool(MyApplet *pMe, uint16 queueSize)
{
    CommandPool *pPool = (CommandPool*)MALLOC(sizeof(CommandPool));
    if (pPool == NULL) {
        DBGPRINTF("Unable to create pool");
        return NULL;
    }
    DBGPRINTF("Creating a pool with qsize %d", queueSize);
    pPool->pMe = pMe;
    pPool->queueSize = queueSize;
    pPool->queue = NULL;
    pPool->state = STOPPED_USER;
    pPool->pcbCurrent = NULL;
    pPool->pendingCb = 0;
    if (!startPool(pPool)) {
        freePool(pPool);
        pPool = NULL;
    }
    return pPool;
}

static boolean startPool(CommandPool *pPool)
{
    int r;
    //create the thread
    r = ISHELL_CreateInstance(pPool->pMe->piShell, AEECLSID_THREAD,
                              (void **)&pPool->cx.thread);
    if (SUCCESS != r) {
        DBGPRINTF("Unable to create thread");
        return FALSE;
    }

    DBGPRINTF("Thread created successfully");
    //initialize the join callback for the IThread
    CALLBACK_Init(&pPool->cx.cbThreadJoin, (PFNNOTIFY)threadJoinCallback, (void *)pPool);
    //callback that should be invoked by tasks to indicate completion of the task
    pPool->cx.cbCompleted = resumeThread;
    //set the join callback
    ITHREAD_Join(pPool->cx.thread, &pPool->cx.cbThreadJoin,
                 &pPool->cx.threadReturn);
    //start the thread
    DBGPRINTF("Thread starting ...");
    r = ITHREAD_Start(pPool->cx.thread, 128, (PFNTHREAD)run, (void*)pPool);
    if (SUCCESS != r) {
        DBGPRINTF("Unable to start thread %d", r);
        ITHREAD_Release(pPool->cx.thread);
        return FALSE;
    }
    //reset the state of pool
    pPool->state = STARTED;

    return TRUE;
}

The run function passed to ITHREAD_Start is the heart of the pool; it executes the tasks.


static void run(IThread *pt, CommandPool *pPool)
{
    AEECallback *pcb = NULL;

    //retrieve the resume callback and store a reference to it in the context
    pcb = ITHREAD_GetResumeCBK(pPool->cx.thread);
    pPool->cx.pcbResumeCallback = pcb;

    DBGPRINTF("pool=>state=%d", pPool->state);

    while (STOPPED_USER != pPool->state)
    {
        //retrieve the next task from the queue
        ThreadTask *cb = NULL;
        if (NULL != pPool->queue) {
            cb = (ThreadTask*)list_last(pPool->queue);
        }

        if (NULL == cb) {
            DBGPRINTF("No element in the pools queue");
        } else {
            //delete the callback from the queue
            pPool->queue = list_delete(cb, pPool->queue);
            pPool->pendingCb--;
            //make it the current callback
            pPool->pcbCurrent = cb;

            //invoke the task - which may involve several asynchronous steps.
            cb->pfnNotify(pPool, cb->data);
        }

        //now yield to the main thread
        DBGPRINTF("Yielding pool thread");
        ITHREAD_Suspend(pPool->cx.thread);

        //the thread execution is resumed. It may be due to the addition of a
        //new task to the queue or to the completion of a task.
        DBGPRINTF("Resuming pool thread");

        //check if the callback has completed
        while (cb != NULL && cb->pReserved != NULL) {
            boolean completed = (boolean)cb->pReserved;
            if (completed == FALSE) {
                DBGPRINTF("We are still processing the callback so lets yield again");
                ITHREAD_Suspend(pPool->cx.thread);
                DBGPRINTF("Waking up thread to recheck callback");
                cb = pPool->pcbCurrent;
            } else {
                DBGPRINTF("Cleaning up the callback 0x%x", cb);
                pPool->pcbCurrent = NULL;
                FREEIF(cb);
                cb = NULL;
            }
        }
    }
    ITHREAD_Exit(pPool->cx.thread, SUCCESS);
}

The user of the pool calls the following method to submit a task to the pool.


void execute(CommandPool *pPool, ThreadTask *pcb)
{
    if (pPool->pendingCb + 1 > pPool->queueSize) {
        DBGPRINTF("We cannot hold the callback. Please add later");
        if (pcb != NULL && pcb->pfnCancel != NULL) {
            pcb->pfnCancel(pcb->pCancelData);
        }
        return;
    }

    pPool->queue = list_add((void*)pcb, pPool->queue);
    pPool->pendingCb++;
    DBGPRINTF("added callback 0x%x to pool.", pcb);

    //schedule a resume callback for the thread as a notification
    //that a new task has been added
    DBGPRINTF("adding a resume callback");
    ISHELL_Resume(pPool->pMe->piShell, pPool->cx.pcbResumeCallback);
}

When a task is complete, the user notifies the pool by invoking the following function, which can be accessed from the pool object passed to the task.


static void resumeThread(CommandPool *pPool)
{
    //indicate that the callback has completed successfully
    if (pPool->pcbCurrent != NULL) {
        DBGPRINTF("Setting completion flag in callback");
        pPool->pcbCurrent->pReserved = (void*)TRUE;
    }
    ISHELL_Resume(pPool->pMe->piShell, pPool->cx.pcbResumeCallback);
}

My two cents.

Posted in Brew | Leave a comment

Notes on Xmx, Xms and Xss JVM memory tuning

A 32-bit JVM process can have a theoretical max heap size of 4 GB based on simple unsigned arithmetic. However, the practical limit varies with the OS, and most OSes cap it at 2-3 GB. [1]
The options -Xms and -Xmx configure the available Java heap, which is a portion of the total memory allocated to the JVM process by the underlying OS.

According to the HotSpot JVM spec, a Java thread maintains a single stack for native code as well as Java code [2], and therefore such a thread is actually a native OS thread with its own stack. The -Xss option can be used to configure the native stack size. The point is: the memory required for Java threads is deducted from the JVM process’ memory, not from the Java heap. In other words, if you see a “java.lang.OutOfMemoryError: unable to create new native thread”, consider reducing Xmx/Xms (and, if that does not work, reducing Xss as well) according to the following equation:

JVM process memory = Xmx + (number of threads) * Xss + any other memory requirements (e.g., PermGen space, etc.)
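To make the arithmetic concrete, here is a quick sketch with made-up numbers (the flag values are hypothetical, not recommendations):

```javascript
// Sketch: thread stacks come out of process memory, not the Java heap.
var xmxMB = 1536;                      // -Xmx1536m
var xssKB = 512;                       // -Xss512k
var threads = 300;
var stackMB = threads * xssKB / 1024;  // memory needed just for thread stacks
var processMB = xmxMB + stackMB;       // ignoring PermGen and other overheads
console.log(stackMB, processMB);
```

So 300 threads at 512k each already need 150 MB on top of the heap, which is exactly the memory the "unable to create new native thread" error is fighting over.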

Note that the default value of Xss depends on the OS, typically 256k-512k [3].

Posted in Java, JVM | Leave a comment