18. Server Architecture [diagram labels: Load Balancer; web servers (Apache, scripts); User Profile Server; Ad Server; Web Services]
Editor's Notes
Abstract: In 2002, Yahoo selected PHP for Web site development and began to phase out its own proprietary server-side scripting language. Three years later, Michael Radwin reflects on how the switch to PHP offered both technical challenges and productivity increases. The first part of the presentation offers a look inside Yahoo's decision-making process to adopt an open-source scripting language. Radwin addresses why Yahoo selected PHP over other languages, focusing on the performance and stability required to serve billions of page views a day. In the second part, Radwin discusses Yahoo's PHP development methodology, which has enabled its engineers to rapidly implement features while still creating software that is maintainable over long periods of time.

Biography: Michael J. Radwin is an engineering manager for Yahoo's Infrastructure Software group. His team develops and supports Web platform technologies such as Apache, PHP, and MySQL, and more recently SOAP/REST toolkits. Radwin has been hacking on Apache since 1998 in high-performance environments and his team has been instrumental in helping Yahoo migrate from proprietary to open source software.
Numbers from the Yahoo! Q3 2005 earnings, October 18, 2005.
Compared PHP 4.1.2, mod_perl, and yScript (Yahoo proprietary).
Test machine: Pentium III 800 MHz, 512 MB RAM, FreeBSD 4.3 (average for early 2002).
Sample app: 33K input script, 41K output.
- Included and evaluated 3 other files: header, navbar, footer
- Arithmetic, regex, echo variables
- Pseudo-personalization ("Hello, mradwin")
- A few calls to a C++ extension:
  - Fetch user profile from the profile server
  - Insert advertisements from the adserver
Each Yahoo property (Sports, Finance, Personals, etc.) sits behind a load balancer, which decides which server is best able to handle each incoming request, based on an algorithm (round robin, least connections, etc.). Each web server runs a set of PHP scripts, which can make remote calls to relational databases or to other web services.
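The two selection policies named above can be sketched in a few lines. This is only an illustration in Python, not a description of the load balancers Yahoo actually ran; the class names and the server names `web1` to `web3` are made up for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation, ignoring current load."""

    def __init__(self, servers):
        self._rotation = cycle(list(servers))

    def pick(self):
        return next(self._rotation)


class LeastConnectionsBalancer:
    """Hands out the server that currently has the fewest open connections."""

    def __init__(self, servers):
        # Hypothetical bookkeeping: open connections per server.
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a request finishes so the count stays accurate.
        self.active[server] -= 1


if __name__ == "__main__":
    rr = RoundRobinBalancer(["web1", "web2", "web3"])
    print([rr.pick() for _ in range(4)])
```

Round robin needs no state beyond its position in the rotation, while least connections has to be told when a request completes, which is why it carries a `release` method.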
Web pages go in the regular Apache htdocs directory:
- http://login.yahoo.com/config/login?.intl=dk
- /usr/local/share/htdocs/dk/login.php
Business logic goes in the PEAR directory:
- /usr/local/share/pear/HTML/Form.php
- /usr/local/share/pear/Yahoo/Sports/Teams.php
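The PEAR paths above follow the usual PEAR naming convention, in which underscores in a class name become directory separators. A small Python sketch of that mapping follows; the function name is hypothetical, and the class name `Yahoo_Sports_Teams` is inferred from the path in the notes rather than stated in them.

```python
import posixpath

# Directory taken from the notes above.
PEAR_DIR = "/usr/local/share/pear"


def pear_class_to_path(class_name, pear_dir=PEAR_DIR):
    """Map a PEAR-style class name to the file expected to define it.

    Each underscore-separated part becomes one path component,
    and the last part gets a .php suffix.
    """
    parts = class_name.split("_")
    return posixpath.join(pear_dir, *parts) + ".php"


if __name__ == "__main__":
    print(pear_class_to_path("HTML_Form"))
    print(pear_class_to_path("Yahoo_Sports_Teams"))  # inferred class name
```

Keeping business logic under one predictable directory tree means a page script can locate a class file from its name alone.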
Profile with APD to see where your hot spots are.
- If you see a function being called 8,000 times on one page, that might be a good candidate to port to C
- Focus on scripts (or include files) that get hit a lot
- Don't bother optimizing a script that only gets called once in a while
Examples of candidates for extensions:
- Distributed locking
- i18n
- Advertisements
- UDB (user database)
- Cookies
- DBM-like flat files
- Security
- Input filtering
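The "called 8,000 times on one page" heuristic amounts to counting calls per function during one page render. The sketch below shows the idea with Python's `cProfile` standing in for APD, which is a PHP profiler and has a different interface; `format_price`, `render_page`, and the threshold are invented for the example.

```python
import cProfile
import pstats


def format_price(cents):
    # Hypothetical small helper that a page calls very often.
    return "$%d.%02d" % divmod(cents, 100)


def render_page():
    # Hypothetical page render that calls the helper 8,000 times.
    return [format_price(c) for c in range(8000)]


def call_counts(func):
    """Run func under the profiler and return {function name: call count}."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    stats = pstats.Stats(profiler)
    counts = {}
    # Keys are (filename, line, name); values start with (primitive calls, total calls).
    for (_filename, _lineno, name), entry in stats.stats.items():
        counts[name] = counts.get(name, 0) + entry[1]
    return counts


def hot_spots(counts, threshold=1000):
    """Names of functions called at least `threshold` times, busiest first."""
    hot = [name for name, n in counts.items() if n >= threshold]
    return sorted(hot, key=lambda name: counts[name], reverse=True)


if __name__ == "__main__":
    print(hot_spots(call_counts(render_page)))
```

Call counts alone do not say how much time a function costs, so a list like this is a starting point for deciding what to port, not a final answer.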