We use Gearman to manage our job queue. This talk covers why a queue is useful in many situations, in web-based interfaces as well as in server-side applications.
11. Introducing Gearman
At LiveJournal, many photos were uploaded
every day, which led to a heavy image-processing
load. That was the motivation for building
such a queue system.
● Yahoo!: 120+ servers, 12M jobs/day
● Digg: 45+ servers, 400K jobs/day
● LiveJournal, SixApart, DealNews, Xing.com,
and many others
(Source: Expert PHP and MySQL, Andrew et al., Wrox, 2010)
● Grooveshark, GoDaddy.com, IMcompany
12. Features of Gearman
● Open Source
● Simple & Fast (rewritten in C)
● Supports a variety of languages:
e.g. build a worker in Python and a client in PHP
● Flexible
● Load balancing
● Failover
14. Architecture
● Client: connects and submits a job
● Gearman Job Server: acks the job, finds all sleeping workers,
and sends a 'noop' command to wake them up
● Worker: wakes up and asks the server for jobs
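The wake-up flow on this slide can be sketched as a toy in-memory model. This is only an illustration, not the real wire protocol: the message names are borrowed from Gearman's protocol packets, while ToyJobServer, ToyWorker and the "reverse" job are hypothetical names made up for the sketch.

```python
from collections import deque


class ToyJobServer:
    """In-memory stand-in for a job server; it does not speak the Gearman protocol."""

    def __init__(self):
        self.queue = deque()   # jobs waiting for a worker
        self.sleeping = []     # workers that announced they are idle
        self.log = []          # ordered record of messages, for inspection

    def submit_job(self, function, payload):
        # Client -> server: submit a job; the server acknowledges it.
        self.queue.append((function, payload))
        self.log.append("client: SUBMIT_JOB")
        self.log.append("server: JOB_CREATED")
        # Server -> workers: wake every sleeping worker with a no-op.
        woken, self.sleeping = self.sleeping, []
        for worker in woken:
            self.log.append(f"server: NOOP -> {worker.name}")
            worker.wake(self)

    def grab_job(self, worker):
        # Worker -> server: ask for a job; each job goes to exactly one worker.
        self.log.append(f"{worker.name}: GRAB_JOB")
        if self.queue:
            self.log.append(f"server: JOB_ASSIGN -> {worker.name}")
            return self.queue.popleft()
        self.log.append(f"server: NO_JOB -> {worker.name}")
        return None

    def pre_sleep(self, worker):
        # Worker -> server: nothing to do, going back to sleep.
        self.log.append(f"{worker.name}: PRE_SLEEP")
        self.sleeping.append(worker)


class ToyWorker:
    def __init__(self, name):
        self.name = name
        self.results = []

    def wake(self, server):
        job = server.grab_job(self)
        if job is not None:
            _function, payload = job
            self.results.append(payload[::-1])  # hypothetical "reverse" job
        server.pre_sleep(self)


server = ToyJobServer()
w1, w2 = ToyWorker("worker-1"), ToyWorker("worker-2")
server.pre_sleep(w1)
server.pre_sleep(w2)
server.submit_job("reverse", "hello")
print("\n".join(server.log))
```

Both workers are woken, but only the first one to ask receives the job; the other gets NO_JOB and goes back to sleep.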
15. Installation
● Compile (for PHP APIs)
tar xzvf gearmand-X.Y.tar.gz
cd gearmand-X.Y
./configure
make
make install
● Pecl Extension
sudo pecl install gearman
● Add below to php.ini
extension="gearman.so"
● Start Server
$ gearmand -d
16. Use Cases
- Crawling a website
- Image Manipulation
- Push Notification
- Sending Email/Messages
- File verification/compressing
- Fetching RSS Feeds
- Indexing on Search Engine
19. Samples - Monitoring
A good tool for monitoring Gearman is available at
https://github.com/yugene/Gearman-Monitor
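For a quick check without a dashboard, gearmand also answers a plain-text "status" command on its port (4730 by default), replying with one tab-separated line per function and a lone "." at the end. A minimal sketch of reading it is below; parse_status and fetch_status are names made up here, and the sample function names are invented for illustration.

```python
import socket


def parse_status(text):
    """Parse the reply to gearmand's text 'status' command into a dict."""
    stats = {}
    for line in text.splitlines():
        if line == ".":            # a lone dot terminates the reply
            break
        if not line.strip():
            continue
        name, total, running, workers = line.split("\t")
        stats[name] = {
            "queued": int(total),      # jobs in the queue for this function
            "running": int(running),   # jobs currently being worked on
            "workers": int(workers),   # workers able to run this function
        }
    return stats


def fetch_status(host="127.0.0.1", port=4730, timeout=2.0):
    """Ask a running gearmand for its status (needs a live server)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"status\n")
        data = b""
        while not data.endswith(b".\n"):
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_status(data.decode("utf-8", "replace"))


# Invented sample reply, so the parser can be tried without a server.
sample = "resize_image\t12\t3\t4\ncrawl_school\t0\t0\t2\n.\n"
print(parse_status(sample))
```

A function whose queued count keeps growing while its worker count stays at zero is usually the first thing to look for.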
20. Result
[Figure: Worker #1 and Worker #2]
An incomplete job is re-queued to an available worker
for fault tolerance
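The re-queue behaviour can be illustrated with a small simulation. This is a toy model under stated assumptions, not Gearman's implementation: a worker "dying" is modelled as a RuntimeError, and run_with_requeue and make_worker are hypothetical helpers.

```python
from collections import deque


def make_worker(name, crash_on=None):
    """Build a fake worker that dies when it receives the job `crash_on`."""
    def work(job):
        if job == crash_on:
            raise RuntimeError(f"{name} terminated while running {job}")
        return f"{name} finished {job}"
    return work


def run_with_requeue(jobs, workers):
    """Hand jobs to workers; put a job back in the queue if its worker dies."""
    queue = deque(jobs)
    alive = list(workers)
    done = {}
    while queue and alive:
        job = queue.popleft()
        worker = alive[0]
        try:
            done[job] = worker(job)
        except RuntimeError:
            alive.pop(0)                 # the dead worker is no longer available
            queue.appendleft(job)        # the incomplete job goes back in the queue
        else:
            alive.append(alive.pop(0))   # rotate to the next available worker
    return done, list(queue)


done, left = run_with_requeue(
    ["job-a", "job-b", "job-c"],
    [make_worker("worker-1", crash_on="job-a"), make_worker("worker-2")],
)
print(done)
```

Here worker-1 dies on job-a, so job-a is picked up again and completed by worker-2. Jobs only stay in the queue if no worker is left alive.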
21. Motivation
● In the beginning, we ran 3 computers
to crawl each school's information
(articles and school schedules).
● Each machine ran one job at a time, so finishing
all of them took too long, and machines sometimes
did the same job as one another.
● That motivated us to build a job queue
system that could run jobs in parallel. And
we found Gearman!
22. Gearman in IMcompany
But there were some challenges!
● How many workers should run on one
server? (How can we use the load efficiently?)
● How can we handle unexpected termination
of workers?
● What if the server's resources are exhausted
by the jobs given to the workers?
(Then the server would stop responding to
other requests/connections, such as
web, SVN, and MySQL.)
24. Reported bugs when using PHP
Bug #63041 "Failed to set exception option" on
connect when any gearman server is down
https://bugs.php.net/bug.php?id=63041
Bug #63648 Gearman worker stops with
segfault after 1-2 hour of working
https://bugs.php.net/bug.php?id=63648
25. Supervisord for sanity
"PHP was not built for long-running requests"
"Memory leaks sometimes occur"
Supervisord helps in both cases!
- Automatically restarts processes based on custom
configuration
* Installation guide - http://www.masnun.com/2011/11/02/gearman-php-and-supervisor-processing-background-jobs-with-sanity.html
26. Exceptional Case #2
PHP sometimes slows down after hundreds of
executions, kill it off if you know this will
happen.
- Mike Willbanks, "Gearman: A Job Server made for Scale"
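One common way to act on this advice is to let each worker process stop after a fixed number of jobs, so that a process supervisor starts a fresh one. A minimal sketch follows; serve, next_job and handle are hypothetical names, the limit of 500 is arbitrary, and the job source is a stand-in rather than a real Gearman connection.

```python
def serve(next_job, handle, max_jobs=500):
    """Process jobs until max_jobs is reached, then return the count handled."""
    handled = 0
    while handled < max_jobs:
        job = next_job()
        if job is None:       # nothing left to do
            break
        handle(job)
        handled += 1
    return handled


# Stand-in job source: 1200 numbered jobs instead of a real job server.
pending = iter(range(1200))
count = serve(lambda: next(pending, None), handle=lambda job: None, max_jobs=500)

# A real worker would exit at this point and rely on the supervisor
# (assumed to be configured to restart it) to launch a fresh process.
print(f"handled {count} jobs; the remaining jobs wait for the next process")
```

Unfinished jobs stay in the queue, so restarting the worker loses no work.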
28. What We Learned
● Gearman's queue is unstable, so our system
badly needed persistent queueing
● Integrating MySQL with Gearman failed
in both versions 1.0.2 and 0.34
● We tried SQLite, but performance was very
poor
Do NOT Reserve Too Many Jobs in a Queue
29. Also We've Tried...
● Submitting jobs over HTTP requests
sometimes does not work and may eventually
freeze the server
● The HTTP connection does not support additional
functions such as authentication
● It is not customizable
Gearman Seems Too Young at This Moment
30. Limitations
● The queue makes no guarantees - use MySQL,
memcached, Redis, PostgreSQL, etc.
● There are few administration tools
● Jobs don't expire
● If a job is dropped, the client is never
notified
- from "http://inside.godaddy.com/cloud-processing-with-gearman/"
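Since jobs do not expire on their own, one possible workaround is to carry a deadline inside the payload and let the worker skip stale jobs. The sketch below is an assumption, not a Gearman feature: make_payload, handle_if_fresh and the "expires_at" field are invented for illustration.

```python
import json
import time


def make_payload(data, ttl_seconds, now=None):
    """Client side: wrap the job data together with an absolute deadline."""
    now = time.time() if now is None else now
    return json.dumps({"expires_at": now + ttl_seconds, "data": data})


def handle_if_fresh(payload, handler, now=None):
    """Worker side: run handler on the job data unless the deadline has passed."""
    now = time.time() if now is None else now
    job = json.loads(payload)
    if now > job["expires_at"]:
        return None               # stale job: skip the work
    return handler(job["data"])


# Fixed timestamps keep the example deterministic.
payload = make_payload({"to": "user@example.com"}, ttl_seconds=60, now=1000.0)
fresh = handle_if_fresh(payload, handler=lambda d: d["to"], now=1030.0)
stale = handle_if_fresh(payload, handler=lambda d: d["to"], now=1100.0)
print(fresh, stale)
```

This only makes workers ignore old jobs. It does not remove them from the queue or tell the client that a job was dropped.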
32. We're hiring!
● Work in Daejeon, Korea
● Flexible, Small Company
● Excellent Benefits
● We Need Senior Hackers
Find more information at http://iamcompany.net/
Thank you!
Any questions?