O slideshow foi denunciado.
Utilizamos seu perfil e dados de atividades no LinkedIn para personalizar e exibir anúncios mais relevantes. Altere suas preferências de anúncios quando desejar.

Financial security and machine learning

Financial security and machine learning

  • Entre para ver os comentários

Financial security and machine learning

  1. 1. Financial Security &Machine Learning 김민경 daengky@naver.com 2015.02.10
  2. 2. 2 • Machine Learning Meetup Outline • Intorduction • Immune System • Machine Learning • Solutions
  3. 3. 3 • Machine Learning Meetup Introduction 신제윤 금융위원장은 금융보안을 위해 모든 금융권이 이상거 래탐지시스템(FDS) 구축을 완료해야 한다고 촉구했다. "핀테크 활성화 방안을 추진하기 위해서 반드시 전제돼야 할 사항은 보안의 중요성"이라며 "정보보안이 확보되지 않 은 서비스는 결국 사상누각이 될 것"이라고 우려했다. 그는 핀테크(Fintech) 추진 방안과 관련해서는 "오프라인 위주의 금융제도 개편을 통해 핀테크 기술이 금융에 자연스 럽게 접목될 수 있도록 지원할 것"이라며 "전자금융업종 규 율을 재설계토록 하겠다"고 밝혔다. Motive
  4. 4. 4 • Machine Learning Meetup FinTech Introduction
  5. 5. 5 • Machine Learning Meetup Fraud Detection Basics • Outlier Detection • detecting data points that don’t follow the trends and patters in the data • rule base detection • anomaly detection • Two approaches for treating input • focus on instance of data point • focus on sequence of data points • Three kinds of algorithms • building a model out of data • using data directly. • immunse system base on temporal data • Real time fraud detection • feasible with model based approach • A model is built with batch processing of training data • A real time stream processor uses the model and makes predictions in real time Introduction
  6. 6. 6 • Machine Learning Meetup Economy Imperative • Not worth spending $200m to stop $20m fraud • The Pareto principle • fthe first 50% of fraud is easy to stop • next 25% takes the same effort • next 12.5% takes the same effort • Resources available for fraud detection are always limited • around 3% of police resources go on fraud ? • this will not significantly increase • If we cannot outspend the fraudsters we must out-think them Introduction
  7. 7. 7 • Machine Learning Meetup Bigdata Ecosystem Open Source Bigdata Ecosystem • Query (NOSQL) : Cassandra, HBase, MongoDB and more • Query (SQL) : Hive, Stinger, Impala, Presto, Shark • Advanced Analytic : Hadoop, Spark,H2O • Real time : Storm, Samza, S4, Spark Streaming Introduction
  8. 8. 8 • Machine Learning Meetup Bigdata Ecosystem Introduction Seldon infrastructure •Real-Time Layer : responsible for handling the live predictive API requests. •Storage Layer : various types of storage used by other components. •Near time / Offline Layer : components that run compute intensive or otherwise non-realtime jobs. •Stats layer : components to monitor and analyze the running system.
  9. 9. 9 • Machine Learning Meetup Immune Systems AIS are adaptive systems inspired by theoretical immunology and observed immune functions, principles and models, which are applied to complex problem domains •Immune system needs to be able to differentiate between self and non-self cells •may result in cell death therefore • Some kind of positive selection(Clonal Selection) • Some kind of negative selection Aritifical Immune Systems
  10. 10. 10 • Machine Learning Meetup Immune Systems Simple View
  11. 11. 11 • Machine Learning Meetup Immune Systems 무과립성 백혈구(無顆粒性 白血球, agranulocyte)의 일종으로 면역 기능 관여하며 전체 백혈구 중에서도 30%를 차지한다. •T세포(T cell) •보조 T세포(Helper T cell) •세포독성 T세포(killer T cell) •억제 T세포(suppressor T cell) •B세포(B cell) •NK세포(Natural killer cell, NK cell) Lymphocyte(림프구)
  12. 12. 12 • Machine Learning Meetup Immune Systems B 세포(B細胞, B cell)는 림프구 중 항체를 생산하는 세포 B cell
  13. 13. 13 • Machine Learning Meetup Immune Systems T세포(T細胞, T cell) 또는 T림프구(T lymphocyte)는 항원 특이적인 적 응 면역을 주관하는 림프구의 하나이다. 가슴샘(Thymus)에서 성숙되기 때문 에 첫글자를 따서 T세포라는 이름이 붙었다. 전체 림프구 중 약 4분의 3이 T 세포 T세포는 아직 항원을 만나지 못한 미접촉 T세포와, 항원을 만나 성숙한 효과 T 세포(보조 T세포, 세포독성 T세포, 자연살상 T세포), 그리고 기억 T세포로 분류 T cell
  14. 14. 14 • Machine Learning Meetup Immune Systems each antibody can recognize a single antigen Antibody, Antigen
  15. 15. 15 • Machine Learning Meetup Immune Systems Biological Immune System
  16. 16. 16 • Machine Learning Meetup Danger Theory •Proposed by Polly Matzinger, around 1995 •Traditional self/non-self theory doesn’t always match observations •Immune system always responds to non-self •Immune system always tolerates self •Antigen-presenting cell(APC):T-cell activation by APCs •Danger theory relates innate and adaptive immune systems •Tissues induce tolerance towards themselves •Tissues protect themselves and select class of response Immune Systems
  17. 17. 17 • Machine Learning Meetup •Tissues induce tolerance by •Lymphocytes receive 2 signals •antigen/lymphocyte binding •antigen is properly presented by APC •Signal 1 WITHOUT signal 2 : lymphocyte death •Tissues protect themselves •Alarm Signals activate APCs •Alarm signals come from •Cells that die unnaturally •Cells under stress •APCs activate lymphocytes •Tissues dictate response type •Alarm signals may convey information Danger Theory Immune Systems
  18. 18. 18 • Machine Learning Meetup Danger Theory Immune Systems
  19. 19. 19 • Machine Learning Meetup Artificial Immune Systems Immune Systems •Vectors Ab = {Ab1, Ab2, ..., AbL} Ag = {Ag1, Ag2, ..., AgL} •Real-valued shape-space •Integer shape-space •Binary shape-space •Symbolic shape-space D= √∑i =1 L (Abi −Ag i )2 Artificial Immune System
  20. 20. 20 • Machine Learning Meetup Immune Systems Artificial Immune System
  21. 21. 21 • Machine Learning Meetup Immune Systems Meta-Frameworks Artificial Immune System
  22. 22. 22 • Machine Learning Meetup Immune Systems Artificial Immune System
  23. 23. 23 • Machine Learning Meetup Immune Systems Artificial Immune Recognition System
  24. 24. 24 • Machine Learning Meetup Immune Systems Hybrid Immune Learning
  25. 25. 25 • Machine Learning Meetup Immune Systems Hybrid Immune Learning Real-Valued Negative Selection
  26. 26. 26 • Machine Learning Meetup Immune Systems •Idiotypic network (Jerne, 1974) •B cells co-stimulate each other •Treat each other a bit like antigens •Creates an immunological memory Immune Network Theory
  27. 27. 27 • Machine Learning Meetup Immune Systems For natural immune system, all cells of body are categorized as two types of self and non-self. The immune process is to detect non-self from cells. use the Positive Selection Algorithm (PSA) to perform the non-self detection for recognizing the malicious executable. Non-self Detection Principle
  28. 28. 28 • Machine Learning Meetup Immune Systems Network Security
  29. 29. 29 • Machine Learning Meetup Immune Systems Intrusion Detection Systems
  30. 30. 30 • Machine Learning Meetup Immune Systems Network Security Architecture of anomaly detection system.
  31. 31. 31 • Machine Learning Meetup Immune Systems Movie Recomendation Systems
  32. 32. 32 • Machine Learning Meetup Types Machine Learnig • Supervised learning : 지도학습 • Data의 종류를 알고 있을 때(Category, Labeled) • ex: spam mail • Unsupervised : 비지도학습 • Data의 종류는 모르지만 패턴을 알고 싶을 때 • SNS, Twitter • Semi-supervised learning : 지도학습 + 비지도학습 • Reinforcement learning : 강화학습 • 잘못된 것을 다시 피드백 • Evolutionary learning : 진화학습(GA, AIS) • Meta Learning : Landmark of data for classifier
  33. 33. 33 • Machine Learning Meetup Genetic algorithm Machine Learnig
  34. 34. 34 • Machine Learning Meetup Genetic algorithm Abnormal Behavior Machine Learnig
  35. 35. 35 • Machine Learning Meetup Types of Anomaly Machine Learnig
  36. 36. 36 • Machine Learning Meetup Association Rule Mining Machine Learnig
  37. 37. 37 • Machine Learning Meetup Finite State Automata (FSA) Since the tests in can be grouped, the states can represent the several tests being performed at the same time. For example, T34 means that T3 and T4 can be done simultaneously Machine Learnig
  38. 38. 38 • Machine Learning Meetup Clustering Machine Learnig
  39. 39. 39 • Machine Learning Meetup Hidden Markov Sequence Based Algorithm •Certain fraudulent activities may not be detectable with instance based algorithms •small amount of money, instance based algorithms will fail to detect the fraud Machine Learnig
  40. 40. 40 • Machine Learning Meetup Decision Tree Profiling? Machine Learnig
  41. 41. 41 • Machine Learning Meetup Support Vector Machine Machine Learnig
  42. 42. 42 • Machine Learning Meetup Neural Network Single Layer Feed Forward Model Machine Learnig
  43. 43. 43 • Machine Learning Meetup anti-k nearest neighbor Outlier Detection Machine Learnig
  44. 44. 44 • Machine Learning Meetup Comparison of Three Algorithms Machine Learnig
  45. 45. 45 • Machine Learning Meetup Classical rule-based approach • Always “too late”: • New fraud pattern is “invented” by criminals • Cardholders lose money and complain • Banks investigate complains and try to understand the new pattern • A new rule is implemented a few weeks later • Expensive to build (knowledge intensive) • Difficult to maintain: • Many rules • The situation is dynamically changing, so frequently • rules have to be added, modified, or removed … Solutions
  46. 46. 46 • Machine Learning Meetup Solutions • Storage • hadoop • HDFS: Distributed File System(DFS) • MapReduce : parallel processing • Algorithms • on-line learning (Immune System and Genetic Algorithms) • batch model • direct data • Stream • Neural stream • Decentralize decision process • Cell base detection • Network for Artificial Immune Systems • Storm, Samja can’t use on-line learning Neural Stream
  47. 47. 47 • Machine Learning Meetup Solutions • Every bank user gets a vector of parameters that describe his/her behavior: an “average-behavior” profile • The system constantly compares this “long-term” profile with the recent behavior of cardholder • Transactions that do not fit into bank user’s profile are flagged as suspicious (or are blocked) • Profiles are updated with every single transaction, so the system constantly adopts to (slow and small) changes in bank user’ behavior A system based on profiles
  48. 48. Q&A Thanks

×