27 Dec 2009

35 Google open-source projects that you probably don't know

This text is a translation of: 34 projekty Open Source udostępnione przez Google (34 open-source projects released by Google)


The list is now longer than 35 projects. While translating from Polish to English I added one new project, which is why the title says 35 instead of 34 ;). After the updates there are even more! Sorry for the confusion.

Google is one of the biggest companies supporting the open-source movement. They have released more than 500 open-source projects (most of them are samples showing how to use their APIs). In this article I will try to cover the most interesting free releases from Google; some of them might be abandoned.


A list of projects developed at Google and released as open source (thanks @dobs from reddit) can also be viewed here

Text File processing

Google CRUSH (Custom Reporting Utilities for SHell)
CRUSH is a collection of tools for processing delimited-text data from the command line or in shell scripts. A tutorial on how to use it is here.
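To give a feel for what "processing delimited-text data" means in practice, here is a minimal Python sketch of one typical task: summing a numeric column grouped by a key column. It does not use CRUSH or reproduce any of its commands; the function name `sum_by_key` and the sample field names are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def sum_by_key(text, key_field, value_field, delimiter="\t"):
    """Sum value_field per distinct key_field in delimited text with a header row.

    Hypothetical helper for illustration only; not part of CRUSH.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    totals = defaultdict(float)
    for row in reader:
        totals[row[key_field]] += float(row[value_field])
    return dict(totals)


SAMPLE = "region\tsales\neast\t10\nwest\t5\neast\t2.5\n"
```

Calling `sum_by_key(SAMPLE, "region", "sales")` groups the three data rows into two totals, one per region.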

C++ libraries and sources

Google Breakpad
An open-source multi-platform crash reporting system. Breakpad is a minidump-generation library used for snapshotting processes out in the field for later analysis. The format is similar to core files but was developed by Microsoft for its crash-uploading facility. A minidump-creation library for Mac/Linux has been implemented so that the crash-processing back-end only needs to understand one format.
Google GFlags
The gflags package contains a library that implements command-line flag processing. As such it's a replacement for getopt(). It has increased flexibility, including built-in support for C++ types like string. Here is an introduction to using it.
Google Glog
The glog library implements application-level logging. This library provides logging APIs based on C++-style streams and various helper macros. It can be used under Linux, BSD, and Windows. Here is an introduction to using Glog.
Google PerfTools
These tools are for use by developers so that they can create more robust applications. They are especially useful to those developing multi-threaded applications in C++ with templates. Includes TCMalloc, heap-checker, heap-profiler and cpu-profiler. Instructions on how to use PerfTools can be found here and here.
Google Sparse Hash
An extremely memory-efficient hash_map implementation. 2 bits/entry overhead. The SparseHash library contains several hash-map implementations, including implementations that optimize for space or speed. The Google sparsehash package consists of two hashtable implementations: sparse, which is designed to be very space efficient, and dense, which is designed to be very time efficient. For each one, the package provides both a hash-map and a hash-set, to mirror the classes in the common STL implementation. Docs are here.
Omaha - Google Update
Omaha, otherwise known as Google Update, is a program to install requested software and keep it up to date. So far, Omaha supports many Google products for Windows, including Google Chrome and Google Earth, but there is no reason for it to only support Google products. Here are the Omaha Overview and the Developers Setup Guide.
Protocol Buffers
Protocol Buffers are a way of encoding structured data in an efficient yet extensible format. Google uses Protocol Buffers for almost all of its internal RPC protocols and file formats. Here is the developer guide. Protocol Buffers can be used from many languages and are supported by a few IDEs, for example NetBeans.
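Part of what makes the encoding compact is that integers on the wire are stored as base-128 "varints": seven bits of payload per byte, least significant group first, with the top bit of each byte saying whether more bytes follow. The sketch below is a hand-written Python illustration of that one idea, not the protobuf library itself; the function names are my own, and it only handles non-negative integers.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (illustrative only)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F          # take the lowest seven bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(low7)         # last byte: continuation bit clear
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single varint from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")
```

Small numbers take one byte, and 300 takes two (`0xAC 0x02`), which is why small field values stay cheap.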

The Internet

Google Code Prettify
A JavaScript module and CSS file that allow syntax highlighting of source code snippets in an HTML page. It supports: C/C++, Java, Python, Ruby, PHP, VisualBasic, AWK, Bash, SQL, HTML, XML, CSS, JavaScript, Makefiles and some Perl. Not supported: Smalltalk and all *CAML*. For an example, click here.
SpriteMe - easy "CSS sprites"
SpriteMe makes it easy to create CSS sprites (combining many small images into one larger image to reduce the number of requests to the web server when loading a page). This project is also available as a service at: http://spriteme.org/.
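The idea behind CSS sprites can be sketched in a few lines of Python: stack the small images into one sheet, remember where each one landed, and emit a CSS rule that shows only that region via a negative `background-position`. This is not how SpriteMe itself is implemented (it is a browser tool); the function names, the vertical stacking strategy, and the file name `sprites.png` are assumptions for illustration.

```python
def sprite_layout(images):
    """Stack images vertically in one sheet.

    images: list of (name, width, height) tuples.
    Returns (sheet_width, sheet_height, {name: (x, y)}).
    """
    positions = {}
    y = 0
    sheet_width = 0
    for name, width, height in images:
        positions[name] = (0, y)           # every image starts at the left edge
        y += height                        # next image goes directly below
        sheet_width = max(sheet_width, width)
    return sheet_width, y, positions


def sprite_css(sheet_url, images):
    """Build one CSS rule per image that reveals its region of the sheet."""
    _, _, positions = sprite_layout(images)
    rules = []
    for name, width, height in images:
        x, y = positions[name]
        rules.append(
            f".{name} {{ background: url({sheet_url}) -{x}px -{y}px no-repeat; "
            f"width: {width}px; height: {height}px; }}"
        )
    return "\n".join(rules)


ICONS = [("home", 16, 16), ("search", 24, 20)]
```

With `ICONS`, the browser fetches a single 24x36 sheet instead of two separate files, and each CSS class shifts the background so only its own icon is visible.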
Reducisaurus is a web service for minifying and serving CSS and JS files. Reducisaurus is based on YUI Compressor and runs on AppEngine.
JaikuEngine is a social microblogging platform that runs on AppEngine. JaikuEngine powers Jaiku.com. For the mobile client source, see: Jaiku Mobile client. Here is the README for the project.
Selector Shell
The Selector Shell is a browser-based tool for testing what CSS becomes in different browsers. It works by taking some raw text, inserting a dynamic STYLE element into the HEAD with that raw text as its content, and then reading the CSSOM to see what the browser has parsed it into. It is written in JavaScript. It can be tested here.
Google Feed Server
Google Feed Server is an open source Atom Publishing Protocol server based on the Apache Abdera framework. Google Feed Server provides a simple back end for data adapters, which allows developers to quickly deploy a feed for an existing data source such as a database. Google Feed Server also provides the Feed Server Client Tool (FSCT), which lets developers perform create, retrieve, update, and delete (CRUD) operations on a Feed Server feed. Here are links to start it up and get running.
Melange, the Spice of Creation
The goal of this project is to create a framework for representing Open Source contribution workflows, such as the existing Google Summer of Code™ (GSoC) program. Using this framework, it will be possible to host future Google Summer of Code programs (and other similar programs, such as the Google Highly Open Participation™ Contest, or GHOP) on Google App Engine. Here you can check out the Getting Started Guide.
namebench
This project hunts down the fastest DNS servers available for your computer to use. namebench runs a fair and thorough benchmark using your web browser history, tcpdump output, or standardized datasets in order to provide an individualized recommendation. namebench is completely free and does not modify your system in any way. This project began as a 20% project at Google. namebench runs on Mac OS X, Windows, and UNIX, and is available with a graphical user interface as well as a command-line interface. BTW: Google has its own free public caching DNS servers at ip: i
Rat Proxy
A semi-automated, largely passive web application security audit tool, optimized for an accurate and sensitive detection, and automatic annotation, of potential problems and security-relevant design patterns based on the observation of existing, user-initiated traffic in complex web 2.0 environments. It detects and prioritizes broad classes of security problems, such as dynamic cross-site trust model considerations, script inclusion issues, content serving problems, insufficient XSRF and XSS defenses, and much more. Docs are here. The project is written and maintained by Michał Zalewski (lcamtuf).
Top Draw is an image generation program. By using simple text scripts, based on the JavaScript programming language, Top Draw can create surprisingly complex and interesting images. The cool part is that the program has built-in support for taking your image and installing it as your desktop image. There's even a Viewer application that can be installed in the menubar to automatically run with the parameters (such as the selected script, update interval) that you've specified. The project is developed in Xcode and runs on Mac OS X 10.5 (Leopard) or later.
EtherPad
This is the open-source release of EtherPad, a web-based realtime collaborative document editor. This project exists mainly as an exhibition of the code, to help support those who want to run or modify their own EtherPad servers, or for those who are curious about how EtherPad's algorithms make realtime collaboration possible. Here are some instructions on how to build EtherPad, and a screencast showing what it is all about. EtherPad uses JavaScript, Java and a Comet server to make realtime collaboration work.
Chromium is the open-source project behind Google Chrome. The Chromium project is about creating a powerful platform for developing a new generation of web applications. There are not many differences between Chrome and Chromium. Here are instructions on how to build Chromium on Linux. There are also official releases of Chrome for Windows, Mac and Linux.
V8 Google's open source JavaScript engine
V8 is Google's open source JavaScript engine. V8 is written in C++ and is used in Google Chrome, the open source browser from Google. V8 implements ECMAScript as specified in ECMA-262, 3rd edition, and runs on Windows XP and Vista, Mac OS X 10.5 (Leopard), and Linux systems that use IA-32 or ARM processors. V8 can run standalone, or can be embedded into any C++ application. Here are some helpful docs on how to begin.
Chromium OS
Chromium OS is an open-source project that aims to build an operating system that provides a fast, simple, and more secure computing experience for people who spend most of their time on the web. Sources are available at: http://git.chromium.org/
Android is the first free, open source, and fully customizable mobile platform. Android offers a full stack: an operating system, middleware, and key mobile applications. It also contains a rich set of APIs that allows third-party developers to develop great applications.

Tools for MySQL

Google MySQL Tools
Various tools for managing, maintaining, and improving the performance of MySQL databases, originally written by Google. This includes:
  • mypgrep.py - a tool, similar to pgrep, for managing mysql connections
  • compact_innodb.py - compacts innodb datafiles by dumping and reloading all tables
Google mMAIM
mMAIM's purpose is to make it easy to monitor and analyze MySQL servers and to easily integrate itself into any environment. It can show Master/Slave sync stats and some efficiency stats, can return statistics from most of the "show" commands, and more!

Other projects

Stressful Application Test (stressapptest)
Stressful Application Test (or stressapptest, its Unix name) tries to maximize randomized traffic to memory from processor and I/O, with the intent of creating a realistic high load situation in order to test the existing hardware devices in a computer. It has been used at Google for some time and now it is available under the Apache 2.0 license. Here are some docs: Introduction, Installation Guide and User Guide
Pop and IMAP Troubleshooter
The POP and IMAP troubleshooter serves to diagnose and solve connection problems from client machines to email services. It reads the client configuration files (Outlook, Windows Mail, Thunderbird, etc.), checks the individual settings, and then attempts to create POP, IMAP, and SMTP connections using these settings. The troubleshooter is coded in C++ using the Qt environment. It can be used generically, or can be customized for the demands of a particular email service.
Openduckbill is a simple command line backup tool for Linux. It monitors the files and directories marked for backup for any changes and transfers these changes to a local backup directory, a remote NFS-exported partition, or a remote ssh server, using the very common rsync command. Here is the installation guide.
ZXing (pronounced "zebra crossing") is an open-source, multi-format 1D/2D barcode image processing library implemented in Java. Our focus is on using the built-in camera on mobile phones to photograph and decode barcodes on the device, without communicating with a server. As far as I know, it can be found on the Android platform. Check out the Getting Started guide and the list of supported devices (my Sony Ericsson device is capable!).
Tesseract OCR Engine
The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV Accuracy test. Between 1995 and 2006 it had little work done on it, but it is probably one of the most accurate open source OCR engines available. The source code will read a binary, grey or color image and output text. A TIFF reader is built in that will read uncompressed TIFF images, or libtiff can be added to read compressed images. Here are the Readme and FAQ.
Neatx - Open Source NX server
Neatx is an Open Source NX server, similar to the commercial NX server from NoMachine. For more information check out the project homepage. The NX protocol is way more robust than VNC (it can be useful when you have a slow Internet connection). Major differences between NX and VNC: An alternative to the Google project is FreeNX (not tested).
This is the code for the following paper: http://books.nips.cc/papers/files/nips20/NIPS2007_0435.pdf. It is an all-kernel-support version of SVM, which can run in parallel on multiple machines. Here is the usage guide.
The Go programming language
A new programming language developed at Google. It was released with this slogan: "GO a systems programming language expressive, concurrent, garbage-collected"
The Google Collections Library for Java
The Google Collections Library is a set of new collection types, implementations and related goodness for Java 5 and higher, brought to you by Google. It is a natural extension of the Java Collections Framework you already know and use.
Google styleguide
Every major open-source project has its own style guide: a set of conventions (sometimes arbitrary) about how to write code for that project. It is much easier to understand a large codebase when all the code in it is in a consistent style. "Style" covers a lot of ground, from “use camelCase for variable names” to “never use global variables” to “never use exceptions.” This project holds the style guidelines we use for Google code. If you are modifying a project that originated at Google, you may be pointed to this page to see the style guides that apply to that project. This is worth reading.


Google is one of the most active companies releasing open source software. On top of that, Google has organized Summer of Code five times: a program in which students from all over the world start working on open source, and Google pays them a scholarship for a few months of hard work.


Guice - a lightweight dependency injection framework for Java 5 and above
Thanks JavaBeat for the summary. Google Guice is a Dependency Injection framework that can be used by applications where relationships/dependencies between business objects would otherwise have to be maintained manually in the application code. Since Guice supports Java 5.0, it takes advantage of generics and annotations, thereby making the code type-safe. Documentation is here: Getting Started guide
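If dependency injection is new to you, the following Python sketch shows the general idea: classes declare what they need in their constructor, and a small injector decides which implementation to hand over. This is not Guice and does not mirror its API (Guice is Java); the `Injector` class, `bind`, `get`, and the example service classes are all hypothetical names made up for this illustration.

```python
import inspect


class Injector:
    """A toy constructor-injection container (illustrative, not Guice)."""

    def __init__(self):
        self._bindings = {}

    def bind(self, interface, implementation):
        """Record which concrete class to build when `interface` is requested."""
        self._bindings[interface] = implementation

    def get(self, cls):
        """Build an instance, recursively resolving annotated constructor args."""
        impl = self._bindings.get(cls, cls)
        params = inspect.signature(impl.__init__).parameters
        kwargs = {
            name: self.get(param.annotation)
            for name, param in params.items()
            if name != "self" and param.annotation is not inspect.Parameter.empty
        }
        return impl(**kwargs)


class PaymentService:
    def charge(self, amount):
        raise NotImplementedError


class FakePaymentService(PaymentService):
    def charge(self, amount):
        return f"charged {amount}"


class Checkout:
    # The dependency is declared by type; Checkout never names a concrete class.
    def __init__(self, payments: PaymentService):
        self.payments = payments

    def pay(self, amount):
        return self.payments.charge(amount)
```

Usage would look like `injector = Injector(); injector.bind(PaymentService, FakePaymentService); injector.get(Checkout).pay(5)`. Swapping the binding changes what `Checkout` receives without touching `Checkout` itself, which is the part that otherwise has to be wired by hand.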
Google Sitebricks - web framework powered by Guice
Sitebricks is a simple development layer for web applications built on top of Google Guice. Sitebricks focuses on early error detection, low-footprint code, and fast development. Like Guice, it also balances idiomatic Java with an emphasis on concise code.
Here are the Getting Started guide and a 5-minute tutorial.
Google ctemplate
CTemplate is a simple but powerful template language for C++. It emphasizes separating logic from presentation: it is impossible to embed application logic in this template language. Here is some documentation.
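To illustrate what "separating logic from presentation" looks like, here is a tiny Python sketch of a logic-less template: the template may only name values to be filled in, and all decisions live in the calling code. It is not ctemplate and does not use its API; the `render` function and the `{{NAME}}` marker handling are simplified assumptions for illustration.

```python
import re

# Markers look like {{NAME}}: uppercase letters, digits and underscores only.
_MARKER = re.compile(r"\{\{([A-Z_][A-Z0-9_]*)\}\}")


def render(template: str, values: dict) -> str:
    """Replace each {{NAME}} marker with values[NAME]; unknown names become ''.

    The template cannot contain conditions or expressions, so application
    logic has to stay in the code that prepares `values`.
    """
    def substitute(match):
        return str(values.get(match.group(1), ""))

    return _MARKER.sub(substitute, template)
```

For example, `render("Hello {{NAME}}, you have {{COUNT}} messages.", {"NAME": "Ann", "COUNT": 3})` fills in both markers, while anything that is not a plain marker is passed through untouched.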

Thanks nostrademons from reddit.com
Google C++ Mocking Framework
Inspired by jMock, EasyMock, and Hamcrest, and designed with C++'s specifics in mind, Google C++ Mocking Framework (or Google Mock for short) is a library for writing and using C++ mock classes. Google Mock:
  • lets you create mock classes trivially using simple macros,
  • supports a rich set of matchers and actions,
  • handles unordered, partially ordered, or completely ordered expectations,
  • is extensible by users, and
  • works on Linux, Mac OS X, Windows, Windows Mobile, MinGW, and Symbian.
Here is the Getting Started guide, and Google C++ Mocking for Dummies.
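The ideas in the list above (mock classes, matchers, ordered expectations) are not specific to C++. As a rough analogy only, here is how the same style of test looks with Python's standard `unittest.mock`; this is not Google Mock and says nothing about its macros. The `Turtle` interface, `draw_line`, and `run_with_mock` are hypothetical names invented for the example.

```python
from unittest import mock


class Turtle:
    """Interface of the collaborator we want to replace in tests."""

    def pen_down(self):
        raise NotImplementedError

    def forward(self, distance):
        raise NotImplementedError


def draw_line(turtle, length):
    """Code under test: it should lower the pen, then move forward."""
    turtle.pen_down()
    turtle.forward(length)


def run_with_mock(length):
    # spec=Turtle makes the mock reject methods that Turtle does not have.
    turtle = mock.Mock(spec=Turtle)
    draw_line(turtle, length)
    # Ordered expectation: exactly these calls, in this order.
    assert turtle.mock_calls == [mock.call.pen_down(), mock.call.forward(length)]
    # Matcher-style check: called once, with any argument.
    turtle.forward.assert_called_once_with(mock.ANY)
    return turtle
```

`run_with_mock(10)` passes silently; if `draw_line` called the methods in the wrong order, the comparison against `mock_calls` would fail.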

Thanks richq from reddit.com
Google C++ Testing Framework
Google's framework for writing C++ tests on a variety of platforms (Linux, Mac OS X, Windows, Cygwin, Windows CE, and Symbian). Based on the xUnit architecture. Supports automatic test discovery, a rich set of assertions, user-defined assertions, death tests, fatal and non-fatal failures, value- and type-parameterized tests, various options for running the tests, and XML test report generation. Here is Google Test Primer and here is Google Test Dev Guide.

Thanks richq from reddit.com
Google Toolbox for Mac
A collection of source code from different Google projects that may be useful to developers working on Macintosh. This package includes the Google Developer Spotlight Importers. The release notes can be found here.

Thanks buffi from reddit.com
This is not entirely a Google project, but it is sponsored by Google. OCRopus(tm) is a state-of-the-art document analysis and OCR system, featuring pluggable layout analysis, pluggable character recognition, statistical natural language modelling, and multi-lingual capabilities. The OCRopus engine is based on two research projects: a high-performance handwriting recognizer developed in the mid-90's and deployed by the US Census bureau, and novel high-performance layout analysis methods. OCRopus development is sponsored by Google and is initially intended for high-throughput, high-volume document conversion efforts. We expect that it will also be an excellent OCR system for many other applications. Here are the usage guide and a guide on how to install the development version.

Thanks 13xforever from reddit.com
Ganeti is a cluster virtual server management software tool built on top of existing virtualization technologies such as Xen or KVM and other Open Source software. Ganeti requires pre-installed virtualization software on your servers in order to function. Once installed, the tool will take over the management part of the virtual instances (Xen DomU), e.g. disk creation management, operating system installation for these instances (in co-operation with OS-specific install scripts), and startup, shutdown, failover between physical systems.

Thanks Matt Brown and btgeekboy from reddit.com
Skia is a complete 2D graphic library for drawing Text, Geometries, and Images.
  • 3x3 matrices w/ perspective
  • antialiasing, transparency, filters
  • shaders, xfermodes, maskfilters, patheffects
Projects using Skia include Android and Chrome.

Thanks zxn0 from reddit.com
Google URL parsing and canonicalization library
A small library for parsing and canonicalizing URLs. You can find README here.
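"Canonicalizing" means turning the many spellings of the same URL into one agreed form so they can be compared. The Python sketch below shows a few common steps using only the standard library; it does not reproduce the rules of Google's library, and the `canonicalize` function and `DEFAULT_PORTS` table are assumptions for illustration. It also ignores userinfo and IPv6 literals to stay short.

```python
from urllib.parse import urlsplit, urlunsplit

# Ports that are implied by the scheme and can be dropped (assumed subset).
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def canonicalize(url: str) -> str:
    """Lowercase scheme and host, drop default ports, and ensure a path."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"   # keep only non-default ports
    path = parts.path or "/"              # an empty path means the root
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))
```

With this, `HTTP://Example.COM:80/a/b?x=1` and `http://example.com/a/b?x=1` come out identical, which is the property a canonical form is meant to give you.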

Thanks pkasting
Libjingle, the Google Talk Voice and P2P Interoperability Library, is a set of components provided to interoperate with Google Talk's peer-to-peer file sharing and voice calling capabilities (the source includes some samples showing how to build a p2p app). The package includes source code for Google's implementation of Jingle and Jingle-Audio, two proposed extensions to the XMPP standard that are currently available in draft form, for both Windows and UNIX/Linux operating systems. Here is the Developer Guide

Thanks jbking
WebDriver (Selenium)
WebDriver is a sophisticated tool for automating web UI testing. It has a simple API designed to be easy to work with and can drive both real browsers, for testing JavaScript-heavy applications, and a pure 'in memory' solution for faster testing of simpler applications. You can check out the 5-minute introduction on the GettingStarted page. The project has since moved to http://selenium.googlecode.com/. For the latest source, please go there.

Thanks ittiam
Google Gears
Gears is an open source project that enables more powerful web applications, by adding new features to your web browser:
  • Let web applications interact naturally with your desktop
  • Store data locally in a fully-searchable database
  • Run JavaScript in the background to improve performance
Gears is the fastest way to make your web app more like a desktop app.

Thanks Anonymous.
Google Web Toolkit (GWT)
Google Web Toolkit (GWT) is a development toolkit for building and optimizing complex browser-based applications. GWT is used by many products at Google, including Google Wave and Google AdWords. It's open source, completely free, and used by thousands of developers around the world.

Thanks Anonymous.
Native Client
Native Client is an open-source technology for running native code in web applications, with the goal of maintaining the browser neutrality, OS portability, and safety that people expect from web apps. It has been released at an early stage to get feedback from the open-source community. Probably Native Client technology will help web developers to create richer and more dynamic browser-based applications. Native Client runs on 32-bit x86 systems that use Windows, Vista, Mac OS X, or Linux. Some ARM and x86-64 support is implemented in the source base, and we hope to make it available for application developers later this year. Here is Getting started guide and FAQ.

Thanks zxn0 and ptman from reddit.com

Currently Native Client can run Quake in your browser! :)
Google Gadgets for Linux
Google Gadgets for Linux provides a platform for running desktop gadgets under Linux, catering to the unique needs of Linux users. It's compatible with the gadgets written for Google Desktop for Windows as well as the Universal Gadgets on iGoogle. Following Linux norms, this project is open-sourced under the Apache License. Here are the Getting Started Guide and instructions on how to build the project.

Thanks Tiger Dong
Google Caja
Caja allows websites to safely embed DHTML web applications from third parties, and enables rich interaction between the embedding page and the embedded applications.

Thanks phosphorescente from reddit.com
Scarcity is a framework for concurrent garbage collection in C++. The framework is organized around the principle of "policy-based design", meaning that behaviors are customized and extended via template parameters. Policy-based design facilitates seamless integration with a broad set of VMs and other runtime environments by allowing the host environment to replace any aspect of the framework, such as thread synchronization primitives, atomic data types, error logging facilities, tracing strategies and so on.
Google concurrency library
A concurrency library for C++. Here is the getting started guide.
CppClean attempts to find problems in C++ source that slow development particularly in large code bases. It is similar to lint; however, CppClean focuses on finding global inter-module problems rather than local problems similar to other static analysis tools. The goal is to find problems that slow development in large code bases that are modified over time leaving unused code. This code can come in many forms from unused functions, methods, data members, types, etc to unnecessary #include directives. Unnecessary #includes can cause considerable extra compiles increasing the edit-compile-run cycle.

Here are some details about implementation
Unladen swallow
An optimized branch of CPython, intended to be fully compatible and significantly faster. Unladen Swallow is Google-sponsored, but not Google-owned. The engineers on the project are full-time Google engineers, but ultimately this is an open-source project, not really that different from Chrome or Google Web Toolkit. Here is the Getting Started Guide.

Thanks Anonymous
Closure Tools
The Closure tools help developers to build rich web applications with JavaScript that is both powerful and efficient. The Closure Compiler compiles JavaScript into compact, high-performance code. The Closure Library is a broad, well-tested, modular, and cross-browser JavaScript library. Closure Templates simplify the task of dynamically generating HTML. Here is documentation.

Thanks Anonymous
SPDY is an experiment with protocols for the web. Its goal is to reduce the latency of web pages. SPDY (pronounced "SPeeDY") is an application-layer protocol for transporting content over the web, designed specifically for minimal latency. There is a SPDY-enabled Google Chrome browser and an open-source web server. In lab tests, the Google team observed up to 64% reductions in page load times when using SPDY.

Thanks Anoop.

Update #2

Cmockery
There are a variety of C unit testing frameworks available; however, many of them are fairly complex and require the latest compiler technology. Some development requires the use of old compilers, which makes it difficult to use some unit testing frameworks. In addition, many unit testing frameworks assume the code being tested is an application or module that is targeted to the same platform that will ultimately execute the test. Because of this assumption, many frameworks require the inclusion of standard C library headers in the code module being tested, which may collide with the custom or incomplete implementation of the C library utilized by the code under test. Cmockery only requires that a test application is linked with the standard C library, which minimizes conflicts with standard C library headers. Also, Cmockery tries to avoid the use of some of the newer features of C compilers. For more information check out the manual.
Perl AppEngine
This project is to get Perl implemented as a supported language on Google App Engine. Want to support Perl? - Read Getting Started.
Perl ProtoBuf
Protocol Buffers for Perl.
Perl Sys::Protect
Perl XS module to override all "dangerous" Perl operations (any operation which interacts with the system). Notably, this module aims to provide the user with an environment identical to the restrictions in place on Google App Engine for Python.
Google App Engine
Google App Engine enables developers to build web applications on the same scalable systems that power our own applications. Google App Engine makes it easy to design scalable applications that grow from one to millions of users without infrastructure headaches. Here are some SDK Release Notes.
JRuby App Engine
JRuby on Google App Engine. With support for the Java Language, it's now possible to run Ruby code on Google App Engine. This project aims to make using JRuby as easy as any of the native App Engine languages. Although Google employees may participate in this project, the code is experimental and is not officially supported by Google.
Android Scripting
The Android Scripting Environment (ASE) brings scripting languages to Android by allowing you to edit and execute scripts and interactive interpreters directly on the Android device. These scripts have access to many of the APIs available to full-fledged Android applications, but with a greatly simplified interface. Want to know more? Check out the FAQ.
Eyes Free
Speech Enabled Eyes-Free Android Applications. The Text-To-Speech (TTS) library allows developers to add speech to their applications. Developers give the TTS object a text string, and the TTS will take care of converting that string to speech and speaking it to the user. The TTS library is designed such that different underlying speech engines can be used without affecting the higher level application logic. Currently, a port of the eSpeak engine is available. Here is the Getting Started Guide
MAO - An Extensible Micro-Architectural Optimizer
This project seeks to build an infrastructure for micro-architectural optimizations at the instruction level. MAO is a stand-alone tool that works on the assembly level. MAO parses the assembly file, performs all optimizations, and re-emits another assembly file. After this, the assembler can be invoked to produce a binary object. MAO reuses much of the code in the GNU Assembler (gas) and needs binutils-2.19 to build correctly. Please see the README.txt file for information on how to build and run MAO. The current MAO version is an early prototype targeting x86.
Google documentation reader
Reading web-based developer documentation is different than browsing typical web pages. As a developer, you probably refer to key technical doc many times per day, and you want it well-organized, easy to navigate, and -- above all -- fast. It works with any open source project hosted on Google Code.
SocialGraph Node Mapper
The Social Graph Node Mapper is a community project to build a portable library to map social networking sites' URLs to and from a new canonical form.
Google visualization
This library makes it easy to implement a Visualization data source so that you can easily chart or visualize your data from any of your data stores. The library implements the Google Visualization API wire protocol and query language. You therefore need write only the code required to make your data available to the library in the form of a data table. This task is made easier by the provision of abstract classes and helper functions.
This is an extension of the Torch3 Machine Learning library for handling various types of Deep Architectures and modifications to the standard Multi-layer Perceptrons:
  • Handles an arbitrary number of fully-connected sigmoidal layers
  • Unsupervised learning of MLPs using various reconstruction costs. Greedy layer-wise learning is available as well.
  • An implementation of the Stacked Denoising Autoencoders
  • A preliminary implementation of collective learning idea, whereby a pair of networks are trained in parallel and are communicating with each other.
A Google employee is involved in this project (it is not an official Google project). Documentation is here.
Bunny The Fuzzer
A closed loop, high-performance, general purpose protocol-blind fuzzer for C programs. Uses compiler-level integration to seamlessly inject precise and reliable instrumentation hooks into the traced program. These hooks enable the fuzzer to receive real-time feedback on changes to the function call path, call parameters, and return values in response to variations in input data. This architecture makes it possible to significantly improve the coverage of the testing process without the noticeable performance impact usually associated with other attempts to peek into run-time internals. A Google employee is involved in this project (it is not an official Google project). Here are some docs.
Thread weaver
Thread Weaver is a framework for writing multi-threaded unit tests in Java. It provides mechanisms for creating breakpoints within your code, and for halting execution of a thread when a breakpoint is reached. Other threads can then run while the first thread is blocked. This allows you to write repeatable tests that can check for race conditions and thread safety. Here is the user guide.
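The "pause one thread at a breakpoint, let another run, then resume" idea can be imitated in Python with two `threading.Event` objects. This is only an analogy: Thread Weaver is a Java framework and sets its breakpoints differently, whereas this sketch passes an explicit hook into the code under test. The `Counter` class and `run_interleaved` function are hypothetical names for illustration.

```python
import threading


class Counter:
    """A deliberately unsynchronized counter with a read-modify-write race."""

    def __init__(self):
        self.value = 0

    def unsafe_increment(self, hook=None):
        current = self.value        # step 1: read
        if hook is not None:
            hook()                  # test-only pause point between read and write
        self.value = current + 1    # step 2: write


def run_interleaved():
    """Force the bad interleaving every time and return the final value."""
    counter = Counter()
    reached = threading.Event()     # set when the first thread hits the pause point
    resume = threading.Event()      # set when the first thread may continue

    def pause_here():
        reached.set()
        resume.wait(timeout=5)

    first = threading.Thread(target=counter.unsafe_increment, args=(pause_here,))
    first.start()
    reached.wait(timeout=5)         # first thread has read 0 and is now parked
    counter.unsafe_increment()      # second increment runs to completion
    resume.set()                    # first thread writes its stale value
    first.join()
    return counter.value
```

Two increments ran, yet `run_interleaved()` returns 1, not 2. Because the interleaving is forced rather than left to chance, the lost update shows up on every run, which is what makes such a test repeatable.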
Google coredumper
A neat tool for creating GDB-readable coredumps from multithreaded applications. The coredumper library can be compiled into applications to create core dumps of the running program -- without terminating. It supports both single- and multi-threaded core dumps, even if the kernel does not natively support multi-threaded core files.
Rollcage API : Sandboxing for Windows
The Rollcage API can be used to sandbox an application on Windows. It is primarily used by Chromium, the open source browser project behind Google Chrome. Here is the design overview.
Google gtags
Server-based tags serving for large codebases, with clients in Python and for Emacs and Vim. This is an extension to GNU Emacs and X-Emacs TAGS functionality, with a server-side component that narrows down the view of a potentially large TAGS file and serves the narrowed view over the wire for better performance. An Emacs Lisp client, a Python client, and Vim extensions are supplied.
PP is intended to provide infrastructure and tools to describe and manipulate hardware registers and fields. Once described, it is possible to read and write fields symbolically. This allows one to browse the state of their hardware.
The iotools package provides a set of simple command line tools which allow access to hardware device registers. Supported register interfaces include PCI, IO, memory mapped IO, SMBus, CPUID, and MSR. Also included are some utilities which allow for simple arithmetic, logical, and other operations. If you ever have to debug hardware, you could probably use these tools.
The suite of fast incremental algorithms for machine learning (sofia-ml) can be used for training models for classification or ranking, using several different techniques. This release is intended to aid researchers and practitioners who require fast methods for classification and ranking on large, sparse data sets. Includes methods for learning classification and ranking models, using Pegasos SVM, SGD-SVM, ROMMA, Passive-Aggressive Perceptron, Perceptron with Margins, and Logistic Regression.
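As a taste of what a "fast incremental" learner looks like, below is a plain-Python sketch of the basic Pegasos update for a linear SVM: pick one example at random, shrink the weights, and nudge them toward the example if its margin is too small. It is written from the published algorithm, not taken from sofia-ml, and it leaves out the bias term and the optional projection step. The function names, default parameters, and toy data are assumptions.

```python
import random


def pegasos_train(examples, lam=0.1, iterations=200, seed=0):
    """Train a linear classifier with the basic Pegasos update.

    examples: list of (features, label) with label in {-1, +1}.
    lam: regularization strength; step size at step t is 1 / (lam * t).
    """
    rng = random.Random(seed)
    w = [0.0] * len(examples[0][0])
    for t in range(1, iterations + 1):
        x, y = rng.choice(examples)                 # one random example per step
        eta = 1.0 / (lam * t)
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        w = [(1.0 - eta * lam) * wi for wi in w]    # shrink (regularization)
        if margin < 1.0:                            # hinge loss is active
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


TOY = [((2, 2), 1), ((3, 1), 1), ((1, 3), 1),
       ((-2, -2), -1), ((-3, -1), -1), ((-1, -3), -1)]
```

Each step touches a single example, so the cost per update does not grow with the size of the data set; that is the property that makes this family of methods attractive for large, sparse data.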
A parallel C++ implementation of fast Gibbs sampling of Latent Dirichlet Allocation
stubl - Stateless (IPv6) Tunnel Broker for LANs
Stubl is a transition mechanism for providing a basic level of IPv6 connectivity to individual nodes on a private network. All that's required is a single Linux server with an IPv6 /64 subnet routed to it. The Stubl server consists of a Linux kernel module (stubl.ko) for handling the tunnel packets, and an HTTP server (stubl_http.py) for calculating clients' addresses and providing tunnel setup instructions. The main advantage of Stubl is that it allows a user on the network, running any major OS, to get a working IPv6 connection with nothing but a few lines of shell commands. This makes it very easy for developers to start getting familiar with the protocol, with minimal administrative overhead.
dcsbwt is a data compressor program and library based on the Burrows-Wheeler transform.
DepAn: Dependency visualization and analysis
DepAn is a direct manipulation tool for visualization, analysis, and refactoring of dependencies in large applications. Check out the User Guide.
Google mobwrite
MobWrite converts forms and web applications into collaborative environments. Create a simple single-user system, add one line of JavaScript, and instantly get a collaborative system.
An encoder and decoder for the format described in RFC 3284: "The VCDIFF Generic Differencing and Compression Data Format." The encoding strategy is largely based on Bentley-McIlroy 99: "Data Compression Using Long Common Strings." A library with a simple API is included, as well as a command-line executable that can apply the encoder and decoder to source, target, and delta files. A slight variation from the draft standard is defined to allow chunk-by-chunk decoding when only a partial delta file window is available.
Update Engine is a flexible Mac OS X framework that can help developers keep their products up-to-date. It can update nearly any type of software, including Cocoa apps, screen savers, and preference panes. It can even update kernel extensions, regular files, and root-owned applications. Update Engine can even update multiple products just as easily as it can update one.
Google site map generator
Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. By creating and submitting Sitemaps to search engines, you are more likely to get better freshness and coverage in search engines. Google Sitemap Generator is a tool installed on your web server to generate the Sitemaps automatically. Unlike many other third party Sitemap generation tools, Google Sitemap Generator takes a different approach: it will monitor your web server traffic, and detect updates to your website automatically.
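To make the idea of a Sitemap more concrete, here is a minimal sketch that turns a list of URLs into a Sitemap file in the sitemaps.org XML format. It is not how Google Sitemap Generator works (that tool watches web server traffic); the function name make_sitemap and the example URL are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads one URL per line on stdin and prints a
# minimal Sitemap document on stdout.
make_sitemap() {
    printf '%s\n' '<?xml version="1.0" encoding="UTF-8"?>'
    printf '%s\n' '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    while IFS= read -r url; do
        [ -n "$url" ] || continue          # skip empty lines
        # XML-escape the characters that may not appear raw inside <loc>
        escaped=$(printf '%s' "$url" |
            sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g')
        printf '  <url><loc>%s</loc></url>\n' "$escaped"
    done
    printf '%s\n' '</urlset>'
}
```

Usage: printf 'http://example.com/\n' | make_sitemap > sitemap.xml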
Google Pose Optimizer
The Google pose optimizer (GPO) is a C++ library that allows reconstruction of the pose of a sensor platform (i.e. its position and orientation over time) based on information from sensors such as GPS, accelerometers and rate gyroscopes. GPO does not provide real-time localization in the way that a Kalman filter would; instead, it generates the pose as the result of a large off-line optimization, which produces better results. Here is the wiki.
Google dnswall
dnswall is a daemon that filters out private IP addresses in DNS responses. It is designed to be used in conjunction with an existing recursive DNS resolver in order to protect networks against DNS rebinding attacks. For details of the attack and various defenses, including dnswall, see http://crypto.stanford.edu/dns/.
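The filtering idea can be sketched in a few lines of shell. This is only an illustration, not dnswall's code: the function names are invented, the input is assumed to be well-formed dotted-quad IPv4 addresses, and only the RFC 1918 ranges plus loopback are treated as private.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit status 0) when $1 lies in
# 10.0.0.0/8, 127.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16.
is_private_ipv4() {
    case "$1" in
        10.*|127.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    esac
    return 1
}

# Hypothetical filter: reads one address per line and prints only the
# public ones, which is roughly what a rebinding filter lets through.
filter_public() {
    while IFS= read -r ip; do
        is_private_ipv4 "$ip" || printf '%s\n' "$ip"
    done
}
```

Usage: printf '8.8.8.8\n192.168.1.5\n' | filter_public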
Google timezone
Choose from a list of major cities around the world or define your own if it's not on the list. Set one of six layouts for your clocks and choose a design and a background for each clock independently. Add up to 15 clocks and never lose track of time again.
Radiohead ;)
Go here for details
GeN - an open-source system for learning generative models of relational data.

26 Dec 2009

34 open-source projects released by Google (34 projekty Open Source udostępnione przez Google)

Google is one of the biggest companies supporting the free software movement. In total, the Mountain View giant has released more than 500 projects as open source; I will try to list only the more interesting ones that have been made public.

Text file processing

Google CRUSH (Custom Reporting Utilities for SHell)
A collection of tools for working with TSV/CSV files, either from the command line or from shell scripts.

C++ libraries and sources

Google Breakpad
An open-source crash reporting system for diagnosing software failures.
Google GFlags
GFlags is a library for processing command-line arguments. It can be seen as a replacement for getopt(), but with much greater flexibility and with support for C++ types such as string.
Google Glog
The Glog library provides application logging through a convenient interface based on streams. Glog also offers many ready-made macros that can be used in software while debugging it.
Google PerfTools
Google PerfTools were created for developers so that they can build better and more robust applications. The tools are especially useful when building multi-threaded C++ applications that use templates. The project includes heap-checker, heap-profiler and cpu-profiler.
Google Sparse Hash
Google Sparse Hash is a hash map implementation optimized for memory usage.
Omaha - Google Update
Omaha, better known as Google Update, is a tool that keeps installed software up to date. So far Omaha is used in Google products for Windows (such as Google Chrome and Google Earth), but it can also be used in third-party software.
Protocol Buffers
Protocol Buffers is a format for encoding structured data and preparing it for transmission over a network. The format is easily extensible and at the same time very efficient. Google uses Protocol Buffers in practically all of its internal RPC services. The format is also supported by NetBeans (there is a plugin that helps with creating Protocol Buffers).

The Internet

Google Code Prettify
A JavaScript module and CSS file that highlight the syntax of source code snippets on a web page. The lexer handles C and its derivatives, Java, Python, Ruby, PHP, VisualBasic, AWK, Bash, SQL, HTML, XML, CSS, JavaScript and Makefiles, and works on a good portion of Perl scripts. Smalltalk and all CAML derivatives are not supported.
SpriteMe - creating "CSS sprites"
SpriteMe makes it very easy to create CSS sprites (many small images are combined into one image, and the individual pictures are then cut out and placed on the page with CSS on the client side). This optimization minimizes the number of requests the browser has to make to the web server to load the whole page, which shortens page load time. The service is also available at http://spriteme.org/.
Reducisaurus is a service for minifying and serving CSS and JavaScript files. The whole service is based on the YUI Compressor and runs on AppEngine.
JaikuEngine is a microblogging service running on AppEngine. JaikuEngine powers Jaiku.com. There is also a mobile client.
Selector Shell
Provides a "shell" inside the browser for testing CSS selectors.
Google Feed Server
Google Feed Server is an open-source implementation of an Atom Publishing Protocol server, based on the Apache Abdera framework. It provides a simple back-end for data adapters that let developers quickly create an Atom feed from existing data, such as a database.
NameBench measures the speed of various DNS servers; the application can take its test data from browser history, tcpdump captures or standard data sets. Namebench is completely free and does not modify the system in any way. The project was started as part of Google's 20% time for own projects. It is worth mentioning here that Google provides its own (caching) DNS servers, available at the IP addresses: and
Rat Proxy
RatProxy is a semi-automated, passive tool for security audits of web applications. It is optimized for detecting and automatically categorizing potential security problems by observing traffic generated by users. The tool was created to make security analysis easier in complex web 2.0 environments.
TopDraw is a program for generating images from a description written in a JavaScript-like language. TopDraw can create very advanced and interesting image compositions. The coolest part of the project is its built-in mechanism for creating images and installing them as the desktop wallpaper. The package also includes a viewer that can be installed in the toolbar and can run the image-generating script at a set interval.
EtherPad is a web-based text editor that lets up to eight people work on one document in real time. Everyone can edit the document at the same time and see all changes made by the other participants (each user has their own color - see the screencast). Participants can save changes to the document at any time. The application was created by AppJet and is built on JavaScript, Java and a Comet server.
Chromium is the open-source project behind the Google Chrome browser (it really differs very little from the version branded by Google).
V8 Google's open source JavaScript engine
V8 is a JavaScript engine written entirely in C++ and used in Google Chrome. V8 implements ECMAScript (i.e. JavaScript) as specified in ECMA-262, 3rd edition. It runs on Windows XP and Vista, Mac OS X 10.5 (Leopard) and Linux, on the IA-32 and ARM architectures.
Chromium OS
The goal of the project is to build an operating system that gives the user a fast, simple and secure platform for browsing and creating content on the Internet. Chromium OS is available together with its source code, which can be found at: http://git.chromium.org/ src
Android is the first free, open-source and fully customizable mobile platform. Android offers a full stack: an operating system, middleware and core mobile applications. The platform includes a rich set of APIs that let developers build interesting applications that integrate with the operating system on a mobile device.

MySQL database server tools

Google MySQL Tools
Google mMAIM
The goal of mMAIM is to monitor and analyze MySQL database servers. It can easily be integrated into any environment where MySQL databases run. It shows the state of Master/Slave replication, statistics for the whole database, statistics from all 'SHOW' commands and much more. The package contains many tools for managing, monitoring and improving the performance of MySQL databases; the project was originally developed by Google.


Stressful Application Test (stressapptest)
Stressful Application Test (on Unix: stressapptest) is a tool that puts a computer under heavy load to check how its individual components behave. It generates data traffic from the processor to memory and a large number of I/O operations. It is used at Google and is now available as an open-source project under the Apache 2.0 license.
Pop and IMAP Troubleshooter
POP and IMAP Troubleshooter helps diagnose and solve problems with connections from client computers to mail servers over POP3 and IMAP. The program can read the configuration files of mail clients (Outlook, Windows Mail, Thunderbird, etc.), check individual settings and then try to make POP, IMAP and SMTP connections using those settings.
Openduckbill is a command-line backup program for Linux. It monitors changes in files and directories marked for backup and synchronizes them to a local backup directory, a remote NFS share, or a server using rsync.
ZXing ("zebra crossing") is a library for reading 1D and 2D barcodes. It is available with source code, supports many image types and is written in Java. Its main goal is to make barcode processing possible on mobile devices such as phones without communicating with a server. If I am not mistaken, it is used in the Android platform.
Tesseract OCR Engine
The Tesseract OCR engine was one of the top 3 engines in the 1995 UNLV accuracy test. Between 1995 and 2006 it was modified very little, but it is probably still one of the most accurate OCR engines released as open source. The source code reads image data in binary, greyscale or color form and turns it into text. A tool that reads uncompressed TIFF images is included.
Neatx - Open Source NX server
Neatx is an open-source project similar to NoMachine's NX server. The NX protocol clearly outperforms VNC, which is a big advantage on a slow link. The main differences between NX and VNC:
  • NX is an X11 client rather than sending screen images like VNC
  • NX works with X, VNC and Remote Desktop (Windows)
  • NX caches data
  • NX is easier to configure
FreeNX can be an alternative project.
This is a version of the SVM (support vector machine) that can run in parallel on many machines. Details are described in the paper.
The GO programming language
A new programming language created at Google. Its syntax is similar to C and Python.
Google styleguide
The set of rules by which application code is written at Google.


Google is one of the most active companies supporting the free software movement, publishing some of its projects for free on the web. What is more, every year it organizes Summer of Code, a program supporting open-source software in which students carry out tasks for existing open-source projects and receive a stipend funded by Google.

24 Dec 2009

Microsoft LifeCAM NX-3000 on Linux and Skype

The Microsoft LifeCAM NX-3000 is a nice little webcam that works on GNU/Linux (which is surprising). Some people have problems using this webcam with Skype.

How to solve problems with Skype

  • Download the newest version of Skype
  • Install it: dpkg --force-all -i skype-*.deb
  • Configure PulseAudio (using the GUI)

    Go to Audio settings

    Check if you have web cam detected (if not install additional kernel modules)

    Change sound input device

  • Start Skype for Linux, and start chatting with friends
The PulseAudio manual has some helpful advice, but getting this running was very simple.

Other links


The new version of Skype for Linux is finally OK: my PC stays quiet during a Skype conversation and I still have plenty of resources left (Skype is evolving in a good direction).

23 Dec 2009

Text manipulation in Bash

Bash is quite a good tool for text manipulation (of course it is no match for Perl/sed/AWK/Python), but it has a lot of functionality built in that many people are not even aware of. In this note I will try to present part of it:
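A minimal sketch of the kind of built-in text manipulation meant here, using Bash parameter expansion only (no external commands). The sample path and variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sample value used only for illustration.
path="/var/log/apache2/access.log"

length=${#path}            # number of characters in $path
file=${path##*/}           # drop the longest prefix matching "*/"  -> access.log
dir=${path%/*}             # drop the shortest suffix matching "/*" -> /var/log/apache2
renamed=${path%.log}.txt   # swap the .log extension for .txt
colons=${path//\//:}       # replace every "/" with ":"
middle=${path:5:3}         # 3 characters starting at offset 5      -> log

printf '%s\n' "$length" "$file" "$dir" "$renamed" "$colons" "$middle"
```

The "#" and "%" forms cut from the front and the back of the value; doubling the character ("##", "%%") makes the match greedy.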

19 Dec 2009

BitLocker without TPM Module in Windows7

Windows BitLocker can store the disk key on a USB stick, not only in a TPM hardware module. To make this possible you have to enable an advanced setting (why is there no dialog like "save my key on USB disk"?).

How to save Windows7 BitLocker key on USB stick?

  • Click: Start | Search, type gpedit.msc and hit enter
  • Navigate to:
    • Local Computer Policy
    • + Computer Configuration
    • ++ Administrative Templates
    • +++ Windows Components
    • ++++ BitLocker Drive Encryption
    • +++++ Operating System Drives -> Require additional authentication at startup
  • Set the policy to Enabled and tick "Allow BitLocker without a compatible TPM"
  • Rerun the BitLocker Wizard

Once you have allowed BitLocker without TPM, the wizard in the BitLocker Drive Preparation will let you store the Startup Key on a USB flash drive. It also allows you to save a Recovery Key, which you will need if you have lost your USB stick.

You will then be asked whether you want to run a BitLocker System Check. If you agree, your computer will be restarted to check whether the USB device is available during the boot-up process (that is nice idea).

This super mini howto is based on: Windows7 BitLocker Review, and it is posted mostly for me (I can never remember the path in gpedit.msc :().


There are also other, cross-platform ways to secure your data. One of them is TrueCrypt, which is comparable to BitLocker; if you are interested in how it really works, you can read the article: How does TrueCrypt work - explained.

Additional links

The links below are not connected with BitLocker, but I think they may be useful for me someday:

14 Dec 2009

An idea for a file server

For some time now I have been able to use, during the day, a network-attached storage box (NAS) connected to Ethernet: two disks in RAID1 enclosed in a very small case :). I must say this is a very convenient way to share files between several computers (recommended!), or simply a place to keep large amounts of data (backup, storage). The NAS I use has quite a few features :), but I thought I would write down here the ones I would like to have in such a device.

13 Dec 2009

Electronic devices - rapid prototyping environments

There are many platforms that can be used for rapid prototyping of electronic devices. In this article I will write about two of them which are quite popular.


Arduino is an open-source electronics prototyping platform based on flexible, easy-to-use hardware and software. It's intended for artists, designers, hobbyists, and anyone interested in creating interactive objects or environments.
What is inside Arduino?
version 2009
  • ATmega168/ATmega328
  • 16 KB Flash Memory (ATmega168)/32 KB Flash Memory (ATmega328) (2 KB are used by bootloader)
  • 1 KB SRAM(ATmega168)/2 KB SRAM (ATmega328)
  • 512 bytes EEPROM (ATmega168)/1 KB EEPROM (ATmega328)
  • 14 digital input/output pins (of which 6 can be used as PWM outputs)
  • 6 analog inputs
  • 16 MHz crystal oscillator
  • USB connection
  • power jack
  • built-in LED
  • ICSP header
  • reset button
  • I2C support
Some helpful links on Arduino
Inspirations for cool projects
The projects listed above were made using ATmega32:

Sun Spot

Project Sun SPOT (Small Programmable Object Technology) was created to encourage the development of new applications and devices. It is designed from the ground up to allow programmers who never before worked with embedded devices to think beyond the keyboard, mouse and screen and write programs that interact with each other, the environment and their users in completely new ways. A Java programmer can use standard Java development tools such as NetBeans to write code.
What is Sun Spot
Read: What is SunSPOT - Introduction, it has many nice features
  • Embedded Development Platform
  • Easy to program - Java top to bottom
  • It has Wireless Communication (Overlay Network - CTP, IPv6/LowPan ; Mesh Networking - AODV, LQRP) ; Multi-hop Over the Air Programming
  • Built in Lithium Ion battery charged through USB
The kit contains two Sun SPOTs plus a base station (the base station is only a processor board, without sensors).
What is inside
  • 180 MHz 32-bit ARM920T with 512K RAM and 4M Flash
  • 2.4 GHz IEEE 802.15.4 support
  • USB interface
  • light sensor ; temperature sensor ; 8 colour leds ; inputs/outputs ; ADC ; 2 buttons ; accelerometer
It is based on open hardware. Schematics can be downloaded from here, and software from here. Sun SPOT runs the Squawk VM and its software is written in Java (there is a special version of the NetBeans IDE); check out the sources for details. There is also a nice tutorial: how to use the emulator.
Sample projects based on Sun Spot


  • Great for a start: cheap (remember that you can damage the device during development)
  • A lot of tutorials, references, sample projects
  • Many people use this!
  • Some SDK is provided
Sun Spot
  • Sun Spot is more powerful out of the box
  • Perfect for creating mesh networks, sensor network or something like that
  • Programmable in Java - you can use NetBeans
  • Not a good starting point: quite expensive, but has many features built in
  • Hardware is inside case - ready to use outdoors
It is a pity that I don't have enough time to start playing with this stuff.

3 Dec 2009

Ubuntu screen profiles in SUSE Linux

What is this all about?

I like the look of the screen application in Ubuntu. This look is provided by the screen-profiles package, which is not available in SUSE Linux (SLES 11).

Before start

Run zypper install newt newt-python to install dependencies.


contents of install-screen-profiles.sh


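# Both values must be set before running; the concrete .deb file name
# and work directory are not given here.
: "${PACKAGE_NAME:?set PACKAGE_NAME to the screen-profiles .deb file name}"
: "${WORKDIR:?set WORKDIR to an empty scratch directory}"

mkdir -p "$WORKDIR"
cd "$WORKDIR" || exit 1

# fetch package
wget "http://us.archive.ubuntu.com/ubuntu/pool/main/s/screen-profiles/$PACKAGE_NAME"

# unpack the .deb (an ar archive) to get data.tar.gz
ar x "$PACKAGE_NAME"

# keep the original screen binary as screen.real
if [ ! -x "/usr/bin/screen.real" ] ; then
   sudo mv /usr/bin/screen /usr/bin/screen.real ||
      echo "Unable to write /usr/bin/screen.real - this can break your screen app"
fi

# unpack
tar -xvzf data.tar.gz

# Install it
sudo find usr -type f -exec install -D -m 755 {} "/{}" \;
sudo find var -type f -exec install -D -m 755 {} "/{}" \;

echo "You may want to delete $WORKDIR (rm -rf $WORKDIR)"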
Run bash install-screen-profiles.sh


  • If you want screen to start after you log in, run bash /usr/share/screen-profiles/screen-launcher-install; to undo this, run bash /usr/share/screen-profiles/screen-launcher-uninstall.
  • F9 behaves a little strangely, so use it at your own risk.

28 Nov 2009

Howto send email from bash on DLink DNS-323

This can be done only on a hacked DNS-323 (ffp/fun_plug). Many people have problems sending emails from shell scripts on the DNS-323, yet it is possible without installing additional software or writing extra scripts. Sending emails right from bash scripts on the DNS-323 is quite easy, but it took me some time to make it work.

23 Nov 2009

Getch() function in Linux

A long time ago I was searching for a getch() implementation for Linux; I went to Google Groups and found some good stuff. Today, during some cleaning on my system, I found the saved HTML page, so I will paste it here (this blog has been my notepad recently).

64k Demos

Not only 64k, but mostly. The first few were found a long time ago; the rest are quite new.

Almost winter

I would like to ride snowboard/ski someday like these guys, the music is also quite nice.

17 Nov 2009

Some helpful Linux software


Some tools which could be helpful in network and system troubleshooting


Some tools which may be helpful when dealing with hardware/security

Database management

  • mtop - MySQL terminal based query monitor
  • mytop - top like query monitor for MySQL
  • ptop - PostgreSQL performance monitoring tool akin to top
Did I miss some software which may be helpful?

Gentoo stuff :P

If you're working on Gentoo box you should think about:

Disable buzzer

# in ~/.bashrc
setterm -blength 0
Did I miss something? If there is some cool stuff that I might need, please write it in the comments below.

Linux IMAP Mail Notifier

If you want to be notified about new mail on Linux using only the IMAP protocol, you should consider mail-notification. It is a small application for GNOME, it looks like this: and it really works quite nicely. Just test it: mail-notification --sm-disable

Linux Tip: Color enabled pager - less

Recently I was using a command line tool which generated many lines of colored text. The output scrolled past so fast on my xterm that I couldn't read it. So I thought that I could pipe it through "| less" to see what was going on, and I was wrong :( - less "out of the box" doesn't support colors. I've tried most pagers, but I prefer less.

... but there is a way!

Less doesn't support colors "as it is", but there are some hacks. Thanks rha7dotcom.
export LESS="-RSM~gIsw"
  • R - Raw color codes in output (don't remove color codes)
  • S - Don't wrap lines, just cut off too long text
  • M - Long prompts ("Line X of Y")
  • ~ - Don't show those weird ~ symbols on lines after EOF
  • g - Highlight only the match found by the last search (/), not every match
  • I - Case insensitive search
  • s - Squeeze empty lines to one
  • w - Highlight first line after PgDn
Remember: the export LESS tip works only if the software you want to page emits raw ANSI color codes, not ncurses-based output!
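A quick way to check the setting is to generate some raw ANSI-colored text yourself and page it. The sketch below is only an illustration; the helper name color_line is made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print $2 wrapped in the raw ANSI colour code $1
# (31 = red, 32 = green, 33 = yellow, ...), followed by a reset.
color_line() {
    printf '\033[%sm%s\033[0m\n' "$1" "$2"
}
```

Usage: color_line 31 "error: disk full" | less -R should show the line in red, while without -R (and without R in the LESS variable) less prints the escape codes as visible characters such as ESC[31m.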

Color man pages using less pager

Thanks Nion
export LESS_TERMCAP_mb=$'\E[01;31m'
export LESS_TERMCAP_md=$'\E[01;31m'
export LESS_TERMCAP_me=$'\E[0m'
export LESS_TERMCAP_se=$'\E[0m'
export LESS_TERMCAP_so=$'\E[01;44;33m'
export LESS_TERMCAP_ue=$'\E[0m'
export LESS_TERMCAP_us=$'\E[01;32m'
To make this permanent, add these entries to your ~/.bashrc (or to the rc file of whichever shell you use). Hope this helps someone.

3 Nov 2009

MySQL and UTF-8 - locales and some advanced settings

The default character set in MySQL since 4.1 is latin1, with latin1_swedish_ci as the default collation. It works well, but it is not so good if you have Polish texts. There is a way to change the defaults, and it is quite easy. After installing MySQL, edit your config file (its default location is /etc/mysql/my.cnf).

20 Oct 2009

OKI MB290 Fax - password reset (web server)

To reset the web server password on the OKI MB290 Fax/Printer/Scanner/Copier you have to:
  1. Press the up arrow to enter the menu
  2. Press * (star)
  3. Press # (hash)
  4. Find SOS22 and push the OK button when it is highlighted
  5. There should be 0001...; the 4th digit from the left (1) indicates that the web server requires a password login
  6. Change the 4th digit to 0, and press OK
  7. You are done
It's very simple when you know which SOS is for what. If you don't, call tech support and don't play 'hacker': changing unknown flags can damage the entire OKI MB290. Warning: use this tip at your own risk; don't blame me if your device ends up dead.

15 Oct 2009

One note for the environment (Blog Action Day)

I have joined Blog Action Day.

Today I should write something about climate change, but I have found some related material, and reposting it here is probably the best idea.

Renewable sources of energy have some effect on the local environment (birds fly into large wind turbines, and many solar power plants raise the local temperature by a few degrees). I think that we should use nuclear power plants in the future, because they are a very efficient and independent power source; as far as I know they do not have much impact on climate change (some report has been written about their influence on climate change - maybe it was sponsored?).

One thing is sure: we should start planning, or even acting on, what to do next when current energy sources become unavailable and our climate has changed so much that we can no longer live on Earth.

29 Sep 2009

Eclipse is not so great for Perl developers

Of course there is EPIC, which is great - a real IDE for writing Perl. But when it comes to updating the whole IDE, everything gets frustrating. I really do not like the update system in Eclipse; for me it is much faster to run:
rm -rf ~/projects/deps-eclipse/* ~/projects/workspace-eclipse/.metadata/
Then go to http://www.eclipse.org/downloads/, download a fresh copy and install the new version plus the additional plugins (I use only SVN and EPIC). My problems are probably caused by high load on the update server, which makes downloading updates really slow, but this can probably be fixed in some way - for example by adding some mirrors?

NetBeans support for Perl language is needed!

Dear NetBeans Team, please provide support for Perl (I really need only syntax highlighting, PerlCritic/PerlTidy and error reporting). This may bring some new users to the IDE (the last news about a real alternative to EPIC was from 2000, and it was commercial software. ARGH!). This is almost done in this plugin, but not all of it: http://netbeans.mojgorod.ru/perl.html. There was also another project (http://code.google.com/p/nbperl/) but it died even before starting.

Who cares about Perl6 and the new Parrot virtual machine if there is no real IDE support? Think about it, Perl developers. Or should I learn another scripting language which is supported by the NetBeans IDE? Comments (in both Polish and English) are welcome.

28 Sep 2009

Images from space by NASA

If you are interested in seeing space (but not in Google Earth), you can download some pictures from space.com (thanks Elliot for posting the URL). If you are interested in images of Earth, some high-resolution pictures can be found at http://visibleearth.nasa.gov/ (the photos are REALLY HUGE - resolution: 86400 x 43200) - Direct link to images.

Howto execute system commands in Perl and possible danger

There are various ways to run a system subprocess in Perl. I will mention only 7: a few native ones (exec(), system(), qx{}/``) and a few which use additional facilities (open("|"), IPC::Open2, IPC::Open3, IPC::Cmd) that are in fact part of the standard Perl distribution, so they can be used without worries.


Most people think that running a system command from Perl is done only with system() or exec(), but there are many ways to achieve this task - some better, some worse. Each of them has different performance, and even the specific way a function is used can increase or decrease performance. This post is written to help the programmer choose the right solution for the task (one that is secure, flexible and has the best performance).

Note: in this article I use quite a lot of text copied from PerlDoc - it is marked with the tag: <cite>.

Executing system command - possible ways

  1. exec() - PerlDoc Page
  2. system() - PerlDoc Page
  3. qx{}/`` - PerlDoc Page
  4. open('|') - PerlDoc Page
  5. IPC::Open2 - PerlDoc Page
  6. IPC::Open3 - PerlDoc Page
  7. IPC::Cmd - PerlDoc Page
  8. IPC::Run - PerlDoc Page - not covered in this article (AFAIK it is not part of the standard Perl distribution on Unix/Linux)
If you don't want to scroll to the summary or conclusion, click the hyperlink.

23 Sep 2009

Wget like progress bar in console

Some time ago I wrote about notifying the user that our software has not hung; today I will write about this again. It is easy to create a progress bar, and there are numerous ready-made modules (oreilly.com). But creating a simple progress bar (one that looks like the progress bar in wget) is pretty straightforward.

Output with detailed progress

Warning: this is only an example in which "\n" is printed after each call to progressBar! In the final code this will be a single animated line.
johny@jambia:~$ perl pgbar.pl 
Starting Hello
[i] Hello |>                                                                     | 0 of 10 (  0%)
[i] Hello |======>                                                               | 1 of 10 ( 10%)
[i] Hello |=============>                                                        | 2 of 10 ( 20%)
[i] Hello |====================>                                                 | 3 of 10 ( 30%)
[i] Hello |===========================>                                          | 4 of 10 ( 40%)
[i] Hello |==================================>                                   | 5 of 10 ( 50%)
[i] Hello |=========================================>                            | 6 of 10 ( 60%)
[i] Hello |================================================>                     | 7 of 10 ( 70%)
[i] Hello |=======================================================>              | 8 of 10 ( 80%)
[i] Hello |==============================================================>       | 9 of 10 ( 90%)
[i] Hello |=====================================================================>|10 of 10 (100%)

Hello started

Output with a normal progress bar

Warning: this is only an example in which "\n" is printed after each call to progressBar! In the final code this will be a single animated line.
johny@jambia:~$ perl pgbar.pl 
Starting Hello
[i] Hello |>                                                                             | (  0%)
[i] Hello |=======>                                                                      | ( 10%)
[i] Hello |===============>                                                              | ( 20%)
[i] Hello |=======================>                                                      | ( 30%)
[i] Hello |==============================>                                               | ( 40%)
[i] Hello |======================================>                                       | ( 50%)
[i] Hello |==============================================>                               | ( 60%)
[i] Hello |=====================================================>                        | ( 70%)
[i] Hello |=============================================================>                | ( 80%)
[i] Hello |=====================================================================>        | ( 90%)
[i] Hello |=============================================================================>| (100%)

Hello started
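
The same kind of bar can be sketched in plain shell. This is not the Perl pgbar.pl that produced the output above, only an illustration of the idea: the function name progress_bar, its arguments and the default width are assumptions made for this example.

```shell
#!/usr/bin/env bash
# progress_bar CURRENT TOTAL [WIDTH]
# Prints one bar such as "|====>     | ( 50%)" without a trailing newline.
# WIDTH is the number of characters between the two "|" (default 40).
progress_bar() {
    local current=$1 total=$2 width=${3:-40}
    local percent=$(( current * 100 / total ))
    local filled=$(( current * (width - 1) / total ))   # "=" signs before ">"
    local bar pad
    bar=$(printf '%*s' "$filled" '' | tr ' ' '=')
    pad=$(printf '%*s' "$(( width - 1 - filled ))" '')
    printf '|%s>%s| (%3d%%)' "$bar" "$pad" "$percent"
}

# Redraw the bar on one line: "\r" moves the cursor back to the start
# of the line instead of starting a new one.
i=0
while [ "$i" -le 10 ]; do
    printf '\r[i] Hello %s' "$(progress_bar "$i" 10 30)"
    sleep 0.1
    i=$(( i + 1 ))
done
printf '\n'
```

Printing "\n" instead of "\r" after every step gives one line per step, like the listings above.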