Saturday, August 22, 2009

Learn About Your Visitors Using Google Analytics

Google Analytics is a great source of information about your visitors. The features I find most useful are the traffic sources report and the list of keywords that brought visitors from Google search to our site. Here are some searches that resulted in a visit:

algorithm ad serving
open source ad management and serving solutions
open source mobile ad server
adserving api
java flex adserver
"ad server" java open source
ad network sample code
ad servers java
ads server for social network
adserver algorithm
adserverbeans performance
advertisement server with flex application
arhitecture for wireless banner ads network
best adserver software
flex ad serving


What I find interesting about those searches is that many of them are probably run by techies, not marketing people. Many searches ask about AdServerBeans itself: its performance and references. People want a comparison of different solutions. People are looking for an ad server implementation with a particular combination of technologies. There are queries related to different niches of ad serving, like mobile or social networks.
Analyzing these queries helps us provide better answers to what people are looking for. It also makes you think in the right direction when making plans for the future.

Thursday, July 30, 2009

We used to run our two sites, http://www.adserverbeans.com and http://www.adserversoft.com, on the Winstone servlet container, but it brought some complications. In particular, to host two domain names with Winstone you have to create folders named www.adserverbeans.com and www.adserversoft.com and set the hostsDir parameter to the location of those folders. When you access a domain name without www, the default host is used, which is wrong: you end up on a different site. You can create soft links to the www folders, which I did. But since we have a demo (a Java app), this resulted in deploying the same app twice within the same classpath. Because application-specific libraries are loaded with the root classloader (mea culpa), those apps couldn't be deployed (conflict of static fields).
So the solution was this. I took Nginx, configured it for two sites and one reverse proxied location. Then I started Winstone on localhost port 8081 without hosts.
It's amazing how fast Nginx serves the pages of our sites in comparison with Winstone. I could see the difference during testing.


server {
    listen 80;
    server_name adserverbeans.com www.adserverbeans.com;

    access_log /usr/local/nginx/adserverbeans.com/log/access.log;
    error_log /usr/local/nginx/adserverbeans.com/log/error.log;

    location / {
        root /home/asb/asb/asb-site/;
        index index.html;
    }

    location /adserverbeans {
        proxy_pass http://127.0.0.1:8081/adserverbeans;
        proxy_redirect off;

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        client_max_body_size 10m;
        client_body_buffer_size 128k;

        proxy_connect_timeout 90;
        proxy_send_timeout 90;
    }
}

server {
    listen 80;
    server_name adserversoft.com www.adserversoft.com;

    access_log /usr/local/nginx/adserversoft.com/log/access.log;
    error_log /usr/local/nginx/adserversoft.com/log/error.log;

    location / {
        root /home/asb/asb/adserversoft/;
        index index.html;
    }
}

Wednesday, July 22, 2009

ASB Prime Time

We have started to work on a subproject code-named ASB Prime Time. Our idea is to create a series of open source ad serving products with different logic. Prime Time has the simplest and most straightforward logic we can think of: advertisers just rent ad placements by the hour. First, publishers register sites and ad places on them and set prices for different hours, e.g. 17-23, Monday-Friday: $10 per hour. They might have higher rates on weekends and lower rates during the night. That's why we call it Prime Time: rates depend on the time of serving, similar to ads displayed on TV.
Advertisers make their bookings on those ad places. In the first version the time will be that of the ad server. Later we will deduce the local time of the user based on the IP and geo location of the visitor.
A few rules for the users:
- advertisers cannot cancel their bookings once they are made (it makes sense since publishers need to plan ahead and they don't want to lose their ad traffic)
- advertisers can make new bookings within the same campaign
- publishers can change rates but they will affect only the new bookings
- publishers get what advertisers spend (no commission fee charged currently in the system)

The target users of this system are large publishers that have exclusive contracts with advertisers.
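
To make the rate rules concrete, here is a minimal sketch in C of how a time-based rate lookup could work. It is only an illustration under assumptions: the struct asb_rate_rule_t, the functions asb_hourly_rate and asb_booking_cost, prices in cents, days numbered 0 (Monday) to 6 (Sunday), and the reading of "17-23" as the hours from 17:00 up to but not including 23:00 are all hypothetical and not taken from the actual Prime Time code.

```c
#include <stddef.h>

/* Hypothetical rate rule: one price for a range of weekdays and hours. */
typedef struct {
    int first_day;      /* 0 = Monday ... 6 = Sunday, inclusive */
    int last_day;       /* inclusive */
    int from_hour;      /* 0..23, inclusive */
    int to_hour;        /* 1..24, exclusive */
    int cents_per_hour; /* price of one hour slot */
} asb_rate_rule_t;

/* Price of a single hour slot. The first matching rule wins;
 * default_cents is used when no rule matches. */
int
asb_hourly_rate(const asb_rate_rule_t *rules, size_t n,
                int day, int hour, int default_cents)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (day >= rules[i].first_day && day <= rules[i].last_day
            && hour >= rules[i].from_hour && hour < rules[i].to_hour)
        {
            return rules[i].cents_per_hour;
        }
    }
    return default_cents;
}

/* Total price of `hours` consecutive hour slots starting at day/hour.
 * The booking may run past midnight and past the end of the week. */
long
asb_booking_cost(const asb_rate_rule_t *rules, size_t n,
                 int day, int hour, int hours, int default_cents)
{
    long total = 0;
    int  i;

    for (i = 0; i < hours; i++) {
        total += asb_hourly_rate(rules, n, day, hour, default_cents);
        hour++;
        if (hour == 24) {
            hour = 0;
            day = (day + 1) % 7;
        }
    }
    return total;
}
```

With a single rule {0, 4, 17, 23, 1000} and a default rate of 500 cents, a three-hour booking starting Monday at 22:00 costs 1000 + 500 + 500 = 2000 cents under these assumptions. Because the price is computed per slot at booking time, a later rate change would naturally affect only new bookings.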

Thursday, June 4, 2009

Using Nginx To Write Ad Server Module

Nginx is a powerful HTTP server written by the Russian programmer Igor Sysoev. I suspect that Rambler (one of the most popular Russian search engines) is powered by Nginx.

We are currently rewriting the ad server module of Ad Server Beans in C using Nginx as the platform for serving our banner files, ad codes and registering actions and clicks.

The first version of Ad Server Beans will be all-Java though. In fact, we are not going to abandon the Java branch of the ad server module in the future either. Java is very powerful and flexible in terms of distributed deployment and development.

The reason why I want to use C and Nginx for the ad server module is mostly memory. It gets quite nasty in Java when you start caching data in memory using TreeMaps, HashMaps and Sets. Memory usage grows quickly, and garbage collection takes up processor time.

C is different in that respect. You always know the size of your structs and pointers, and you can easily calculate the memory consumption.

I'm not saying Java is bad and C is good. I guess I should just say that I am a bad Java programmer and can't keep my memory usage under control. Full stop.

Let's have a look at what I have written so far.

First I wrote a function that returns a pointer to ngx_str_t based on the ngx_http_request_t. This is the entry point for our module designed to serve ad codes.


ngx_str_t *
asb_http_ngx_handler(ngx_http_request_t *r)
{
    asb_http_request_t  *http_params;

    http_params = asb_get_http_params(r);
    return asb_fill_template(http_params);
}


In fact there is some boilerplate code necessary to create an nginx module. I copied it from the empty_gif module. To avoid browser caching you just shouldn't set r->headers_out.last_modified_time (where r is ngx_http_request_t *). In my case I commented out all lines that were setting it.

Next I created a struct that reflects the request parameters converted to convenient types for ASB. The struct and the function that fills it look like this:


/* Request parameters converted to convenient types for ASB */
struct asb_http_request_s {
    u_char     *target_url;
    u_char     *target_window;
    ngx_int_t   image_id;
    u_char     *image_src;
    ngx_int_t   width;
    ngx_int_t   height;
    u_char     *alt_text;
    u_char     *window_status;

    ngx_int_t   banner_id;
    ngx_int_t   event_id;
};

typedef struct asb_http_request_s asb_http_request_t;

asb_http_request_t *
asb_get_http_params(ngx_http_request_t *r)
{
    asb_http_request_t  *res;
    ngx_str_t            banner_id;
    ngx_str_t            event_id;

    res = (asb_http_request_t *) malloc(sizeof(asb_http_request_t));
    if (res == NULL) {
        return NULL;
    }

    /* a missing or non-numeric argument falls back to 0
       (ngx_atoi returns NGX_ERROR for non-numeric input) */
    res->banner_id = 0;
    if (ngx_http_arg(r, (u_char *) "bannerId", 8, &banner_id) == NGX_OK) {
        res->banner_id = ngx_atoi(banner_id.data, banner_id.len);
        if (res->banner_id == NGX_ERROR) {
            res->banner_id = 0;
        }
    }

    res->event_id = 0;
    if (ngx_http_arg(r, (u_char *) "eventId", 7, &event_id) == NGX_OK) {
        res->event_id = ngx_atoi(event_id.data, event_id.len);
        if (res->event_id == NGX_ERROR) {
            res->event_id = 0;
        }
    }

    /* placeholder values until banner selection is implemented */
    res->target_url = (u_char *) "http://www.adserverbeans.com/asb?eventId=4";
    res->target_window = (u_char *) "_blank";
    res->image_id = 1;
    res->image_src = (u_char *) "http://www.adserverbeans.com";
    res->width = 468;
    res->height = 60;
    res->alt_text = (u_char *) "Alt text!";
    res->window_status = (u_char *) "window status!";

    return res;
}
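
If you want to experiment with the parameter handling without an nginx source tree, the lookup can be approximated in plain C. The sketch below is only a rough stand-in for the ngx_http_arg plus ngx_atoi pair, not nginx API: the function name asb_query_int and the exact matching rules are my assumptions for illustration. Like the code above, it falls back to 0 when the argument is missing or not a number.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: finds `name` in a query string such as
 * "bannerId=12&eventId=4" and returns its value as a non-negative
 * integer. Returns 0 when the argument is missing, negative or
 * not a plain number. */
long
asb_query_int(const char *args, const char *name)
{
    size_t      name_len = strlen(name);
    const char *p = args;

    while (p != NULL && *p != '\0') {
        /* an argument matches only at the start of a "name=value" pair */
        if (strncmp(p, name, name_len) == 0 && p[name_len] == '=') {
            const char *val = p + name_len + 1;
            char       *end;
            long        v = strtol(val, &end, 10);

            if (end == val || v < 0 || (*end != '&' && *end != '\0')) {
                return 0;
            }
            return v;
        }

        /* skip to the next pair, if any */
        p = strchr(p, '&');
        if (p != NULL) {
            p++;
        }
    }
    return 0;
}
```

For example, asb_query_int("bannerId=12&eventId=4", "eventId") yields 4, while asking for "Id" yields 0 because a name has to match a whole argument, not just the tail of one.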


The strategy is similar to that of the Java implementation. We create an object that first collects the converted request parameters; after banner selection it is filled with the values of the selected banner. Finally, we use this object to fill our ad code template.
I haven't implemented the banner-selection algorithm in C yet, but the template-filling function looks like this:


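ngx_str_t *
asb_fill_template(asb_http_request_t *http_params)
{
    ngx_str_t  *res;
    u_char     *re, *last;
    size_t      size;

    if (http_params == NULL) {
        return NULL;
    }

    /* template length plus some room for the substituted values */
    size = asb_adcode_image_js_len + 100;
    re = (u_char *) calloc(size, sizeof(u_char));
    if (re == NULL) {
        return NULL;
    }

    /* ngx_snprintf never writes more than size bytes and returns
       a pointer to the end of the output (longer output is cut off) */
    last = ngx_snprintf(re, size, ASB_ADCODE_IMAGE_JS,
                        http_params->target_url,
                        http_params->target_window,
                        http_params->image_id,
                        http_params->image_src,
                        http_params->width,
                        http_params->height,
                        http_params->alt_text,
                        http_params->window_status);

    res = (ngx_str_t *) malloc(sizeof(ngx_str_t));
    if (res == NULL) {
        free(re);
        return NULL;
    }
    res->data = re;
    res->len = last - re;

    return res;
}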


I don't like the idea of using malloc/calloc directly, since nginx has its own memory management functions (pool allocators such as ngx_palloc). But I still need to figure out the benefit of using them.
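
One benefit of pool-style allocation, as I understand it, is that everything allocated while handling a request can be released in a single step when the request is finished, so no individual free calls can be forgotten. The plain-C sketch below illustrates that idea only. It is not nginx's implementation (nginx pools chain blocks and treat large allocations separately), and the names asb_arena_t, asb_arena_init, asb_arena_alloc and asb_arena_destroy are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* A deliberately tiny arena: one fixed block, bump allocation,
 * and a single free for everything. */
typedef struct {
    unsigned char *base;
    size_t         size;
    size_t         used;
} asb_arena_t;

/* Returns 0 on success, -1 if the block could not be allocated. */
int
asb_arena_init(asb_arena_t *a, size_t size)
{
    a->base = malloc(size);
    a->size = (a->base != NULL) ? size : 0;
    a->used = 0;
    return (a->base != NULL) ? 0 : -1;
}

/* Returns zeroed memory from the arena, or NULL when it is full. */
void *
asb_arena_alloc(asb_arena_t *a, size_t n)
{
    /* round the request up to pointer-size alignment */
    size_t  aligned = (n + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
    void   *p;

    if (aligned < n || aligned > a->size - a->used) {
        return NULL;
    }
    p = a->base + a->used;
    a->used += aligned;
    memset(p, 0, n);
    return p;
}

/* Releases every allocation made from the arena at once. */
void
asb_arena_destroy(asb_arena_t *a)
{
    free(a->base);
    a->base = NULL;
    a->size = 0;
    a->used = 0;
}
```

In the module above, the request struct, the template buffer and the resulting string could all come from one such arena per request and be dropped together, instead of being tracked and freed one by one.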

Next stop is banner selection. I am playing around with examples for MySQL Connector/C, as I will need to retrieve data from our database and store it in memory.

Tuesday, June 2, 2009

Calling C-Functions From Assembly On Linux

A good start on using assembly to call C functions is "Assembly Language Step-by-Step: Programming with DOS and Linux" by Jeff Duntemann.
Starting from chapter 14 (The Programmer's View of Linux Tools and Skills to Help You Write Assembly Code under a True 32-Bit OS) it gets hot.
Here is what the assembly code looks like if you want to print a message to the console using the C function puts:


; Source name     : EATLINUX.ASM
; Executable name : EATLINUX
; Version         : 1.0
; Created date    : 11/12/1999
; Last update     : 11/22/1999
; Author          : Jeff Duntemann
; Description     : A simple program in assembly for Linux, using NASM 0.98,
;                   demonstrating the use of the puts C library routine to display text.
;
; Build using these commands:
;   nasm -f elf -g eatlinux.asm
;   gcc eatlinux.o -o eatlinux
;
[SECTION .text]         ; Section containing code

extern puts
global main             ; Required so linker can find entry point

main:
    push ebp            ; Set up stack frame for debugger
    mov ebp,esp
    push ebx            ; Program must preserve ebp, ebx, esi, & edi
    push esi
    push edi
;;; Everything before this is boilerplate; use it for all ordinary apps!

    push dword eatmsg   ; Push a 32-bit pointer to the message on the stack
    call puts           ; Call the clib function for displaying strings
    add esp, 4          ; Clean stack by adjusting esp back 4 bytes

;;; Everything after this is boilerplate; use it for all ordinary apps!
    pop edi             ; Restore saved registers
    pop esi
    pop ebx
    mov esp,ebp         ; Destroy stack frame before returning
    pop ebp
    ret                 ; Return control to Linux

[SECTION .data]         ; Section containing initialized data

eatmsg: db "Eat at Joe's!",10,0

[SECTION .bss]          ; Section containing uninitialized data



You can see that to call a function from the C library you need to push its parameters onto the stack, call the function, and then adjust your stack pointer. Multiple arguments are pushed from right to left. The stack adjustment is 4 bytes times the number of 32-bit arguments. The return value is put into eax. If a function changes the value of its argument (when passed as a pointer), you can read the new value from the address of this argument (label).

Note the differences from the example in the previous post. We need to declare our functions with extern, use global main instead of _start, and return with ret. Also, gcc instead of ld is used to link the object.
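
For comparison, here is the same calling pattern seen from the C side. This is only a sketch for orientation: the function names eat_at_joes and add_into are made up for this illustration, and the comments describe the 32-bit convention discussed above, not what a particular compiler emits.

```c
#include <stdio.h>

/* C-level view of the assembly program: one pointer argument is
 * passed to puts, and its int result (the value the assembly code
 * finds in eax) is returned to the caller. */
int
eat_at_joes(void)
{
    static const char eatmsg[] = "Eat at Joe's!\n";

    return puts(eatmsg);
}

/* Three arguments. In the convention described above they are pushed
 * in the order b, a, out (right to left), and the caller adds 12 to
 * esp afterwards. The sum is written through the pointer `out`, so
 * the caller reads it back from the address it passed in. */
int
add_into(int *out, int a, int b)
{
    *out = a + b;
    return 0;
}
```

Calling add_into(&sum, 2, 3) leaves 5 in sum, which is the C counterpart of reading the new value from the label whose address you pushed.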

Monday, June 1, 2009

Setting Up Environment For Linux Assembly Programming

In this post I will describe the tools and setup that I am using to write, assemble and debug assembly code on Linux (Debian Etch). There are probably better ways out there to set up your assembly development environment. I'm just sharing my solution after spending some time looking for a smooth development process.

First you need to type or copy-and-paste your assembly code into a text file. I'm using Kate; vim can also do the job. Create a new file and name it 'hello.asm'.



section .text
    global _start           ;must be declared for linker (ld)

_start:                     ;tell linker entry point
    mov edx,len             ;message length
    mov ecx,msg             ;message to write
    mov ebx,1               ;file descriptor (stdout)
    mov eax,4               ;system call number (sys_write)
    int 0x80                ;call kernel

    mov ebx,0               ;exit status (0 = success)
    mov eax,1               ;system call number (sys_exit)
    int 0x80                ;call kernel

section .data
msg db 'Hello, world!',0xa  ;our dear string
len equ $ - msg             ;length of our dear string


Next you need to assemble and link the object file.
nasm -f elf -g hello.asm
ld -o hello hello.o


Note: if you are going to call C functions from your assembly code, you will need to follow a different convention in your assembly and use gcc for linking. More on that in the next post.

At this point you should have an executable named 'hello'.

vitaly@geniot:/opt/projects/asm$ ./hello
Hello, world!
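
If the system call sequence looks opaque, it may help to see roughly the same thing expressed in C. The sketch below is only an approximation for comparison: it goes through the C library's write wrapper instead of int 0x80, and the function name say_hello is made up for this illustration.

```c
#include <unistd.h>

static const char msg[] = "Hello, world!\n";

/* Same job as sys_write in hello.asm: file descriptor 1 (stdout),
 * a pointer to the message and its length. Returns the number of
 * bytes written, or -1 on error. */
ssize_t
say_hello(void)
{
    /* sizeof counts the terminating NUL byte, which the assembly
     * string does not have, so subtract one */
    return write(STDOUT_FILENO, msg, sizeof(msg) - 1);
}
```

The three arguments correspond to ebx, ecx and edx in the assembly version, and the length passed here is 14 bytes: 13 characters plus the newline.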


Let's debug it now! We are going to use DDD for that (apt-get install ddd).
Start DDD like this:

vitaly@geniot:/opt/projects/asm$ ddd hello

This tells DDD and gdb where to find the symbols and the source code. DDD is a little bit buggy. For example, you cannot put a breakpoint on the first line of code, but you can always add a nop as the first instruction to work around that.

To step through the code, use the commands tool (View->Commands Tool).
Use the registers panel (Status->Registers) to see the state of the registers.
Finally, use Data->Memory to observe the state of your memory in the Data Window (View->Data Window).

Here is how it looks on my machine:

[Screenshot: DDD with the registers panel and the Data Window]

You can see the state of the registers: edx contains the length of our string (14) and ecx contains the address where the string is stored in memory (0x804809d). The Data Window shows a fragment of memory starting from this address. Since we asked to examine it as a string, one byte at a time, the display keeps reading until it finds a null terminator.

You may run into the following problems while trying to set up your environment like this:
- DDD can't find the source code of your program. To fix that, cd to the directory with the executable and the source file and start ddd with the name of the executable as an argument.
- gdb can't find symbols for the program. Make sure you assemble your object file with the -g argument.

Comments, suggestions and questions are welcome.