
Bibliography

Abbott, 1952
Abbott, E. (1952).
Flatland. A Romance in Many Dimensions.
Dover Publications, New York.

Albus, 1971
Albus, J. (1971).
A theory of cerebellar function.
Mathematical Biosciences, 10:25-61.

Albus, 1981
Albus, J. (1981).
Brain, Behavior, and Robotics.
Byte Books.

Ameisen, 1999
Ameisen, J. (1999).
La sculpture du vivant. Le suicide cellulaire ou la mort créatrice.
Seuil.

Anderson, 1989
Anderson, C. (1989).
Learning to control an inverted pendulum using neural networks.
IEEE Control Systems Magazine, 9(3):31-37.

Bain, 1873
Bain, A. (1873).
Mind and Body. The Theories of Their Relation.
Henry King, London.

Baird, 1995
Baird, L. C. (1995).
Residual algorithms: Reinforcement learning with function approximation.
Proceedings of the Twelfth International Conference on Machine Learning, pages 30-37.

Barto et al., 1983
Barto, A., Sutton, R., and Anderson, C. (1983).
Neurolike adaptive elements that can solve difficult learning control problems.
IEEE Transactions on Systems, Man, and Cybernetics, SMC-13:834-846.

Bellman, 1957
Bellman, R. (1957).
Dynamic Programming.
Princeton University Press, Princeton, NJ.

Bertsekas and Tsitsiklis, 1996
Bertsekas, D. and Tsitsiklis, J. (1996).
Neuro-Dynamic Programming.
Athena Scientific.

Bersini and Gorrini, 1996
Bersini, H. and Gorrini, V. (1996).
Three connectionist implementations of dynamic programming for optimal control: A preliminary comparative analysis.
In Workshop on Neural Networks for Identification and Control in Robotics.

Berthoz, 1997
Berthoz, A. (1997).
Le sens du mouvement.
Odile Jacob.

Bertsekas, 1987
Bertsekas, D. (1987).
Dynamic Programming.
Prentice Hall.

Bouchon, 1988
Bouchon, B. (1988).
Entropic models: a general framework for measures of uncertainty and information.
Logic in Knowledge-Based Systems, Decision and Control, pages 93-105.

Bouchon-Meunier, 1989
Bouchon-Meunier, B. (1989).
Incertitude, information, imprécision: une réflexion sur l'évolution de la théorie de l'information.
Revue Internationale de Systémique, 3(4):375-385.

Brooks, 1986
Brooks, R. A. (1986).
A robust layered control system for a mobile robot.
IEEE Journal of Robotics and Automation, RA-2(1):14-23.

Carnap, 1950
Carnap, R. (1950).
Logical Foundations of Probability.
University of Chicago Press.

Changeux, 1983
Changeux, J. (1983).
L'homme neuronal.
Fayard.

Damasio, 1994
Damasio, A. (1994).
Descartes' Error: Emotion, Reason and the Human Brain.
Picador.

Damasio, 1999
Damasio, A. (1999).
Le sentiment même de soi - Corps, émotions, conscience.
Editions Odile Jacob Sciences.

Daubechies, 1992
Daubechies, I. (1992).
Ten Lectures on Wavelets, volume 61 of CBMS-NSF Regional Conference Series in Applied Mathematics.
SIAM, Philadelphia.

Davesne and Barret, 1999a
Davesne, F. and Barret, C. (1999a).
Constraint based memory units for reactive navigation learning.
In European Workshop on Learning Robots.

Davesne and Barret, 1999b
Davesne, F. and Barret, C. (1999b).
Reactive navigation of a mobile robot using a hierarchical set of learning agents.
In IROS'99.

Dayan, 1992
Dayan, P. (1992).
The convergence of TD($\lambda$) for general $\lambda$.
Machine Learning, 8:341-362.

Dayan and Sejnowski, 1994
Dayan, P. and Sejnowski, T. (1994).
TD($\lambda$) converges with probability 1.
Machine Learning, 14:295-301.

Denton, 1985
Denton, M. (1985).
Evolution: Une théorie en crise.
Flammarion.

d'Espagnat, 1985
d'Espagnat, B. (1985).
Une incertaine réalité.
Gauthier-Villars.

d'Espagnat, 1994
d'Espagnat, B. (1994).
Le réel voilé.
Fayard.

Edelman, 1992
Edelman, G. (1992).
Bright Air, Brilliant Fire: On the Matter of Mind.
Basic Books, New York.

Gaussier et al., 1997
Gaussier, P., Joulain, C., Zrehen, S., Banquet, J., and Revel, A. (1997).
Visual navigation in an open environment without map.
To appear in IROS conference, Grenoble.

Gaussier and Zrehen, 1995
Gaussier, P. and Zrehen, S. (1995).
PerAc: A neural architecture to control artificial animals.
Robotics and Autonomous Systems, 16(2-4):291-320.

Glorennec, 1994
Glorennec, P. (1994).
Fuzzy Q-learning and dynamical fuzzy Q-learning.
In FUZZ-IEEE'94, Orlando.

Harnad, 1992
Harnad, S. (1992).
Cognition and the symbol grounding problem.
Electronic symposium on computation.

Hartley, 1928
Hartley, R. (1928).
Transmission of information.
Bell System Technical Journal, 7:535-563.

Hebb, 1949
Hebb, D. (1949).
The Organization of Behavior.
John Wiley & Sons, New York.

Hilbert, 1928
Hilbert, D. and Ackermann, W. (1928).
Grundzüge der theoretischen Logik.
Springer, Berlin.

Hsu et al., 1990
Hsu, F., Anantharaman, T., Campbell, M., and Nowatzyk, A. (1990).
A grandmaster chess machine.
Scientific American, 263(4):11-50.

Jacobs, 1993
Jacobs, O. (1993).
Introduction to Control Theory.
Oxford University Press.

James, 1890
James, W. (1890).
Principles of Psychology.
Henry Holt, New York.

Jaulin and Walter, 1993
Jaulin, L. and Walter, E. (1993).
Set inversion via interval analysis for non-linear bounded estimation.
Automatica, 29(4):1053-1064.

Jaulin et al., 1996
Jaulin, L., Walter, E., and Didrit, O. (1996).
Guaranteed robust nonlinear parameter bounding.
In CESA'96 IMACS Multiconference Symposium on Modelling, Analysis and Simulation, volume 2, pages 1156-1163, Lille.

Jouffe, 1997
Jouffe, L. (1997).
Apprentissage de systèmes d'inférence floue par des méthodes de renforcement: application à la régulation d'ambiance dans un bâtiment d'élevage porcin.
Doctoral thesis, Université de Rennes 1.

Jutten and Herault, 1991
Jutten, C. and Herault, J. (1991).
Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture.
Signal Processing, 24:1-10.

Kaelbling et al., 1996
Kaelbling, L., Littman, M., and Moore, A. (1996).
Reinforcement learning: A survey.
Journal of Artificial Intelligence Research, 4:237-285.

Kahneman and Tversky, 1979
Kahneman, D. and Tversky, A. (1979).
Prospect theory: An analysis of decision under risk.
Econometrica, 47:263-291.

Kalman, 1960
Kalman, R. (1960).
A new approach to linear filtering and prediction problems.
Transactions of the ASME - Journal of Basic Engineering, pages 35-45.

Kohonen, 2001
Kohonen, T. (2001).
Self-Organizing Maps, volume 30.
Springer Series in Information Sciences, Berlin.

Lecerf, 1997
Lecerf, C. (1997).
Une leçon de piano ou la double boucle de l'apprentissage cognitif, volume 3.
Travaux et Documents, Université Paris 8 Vincennes-Saint-Denis.

Lin and Kim, 1991
Lin, C. and Kim, H. (1991).
CMAC-based adaptive critic self-learning control.
IEEE Transactions on Neural Networks, 2:530-533.

Lin, 1992
Lin, L.-J. (1992).
Self-improving reactive agents based on reinforcement learning, planning and teaching.
Machine Learning, 8:293-321.

Maaref et al., 1999
Maaref, H., Barret, C., and Amamou, A. (1999).
Optimization of a fuzzy controller for a reactive navigation.
Computational Intelligence and Applications, pages 193-197.

McCulloch and Pitts, 1943
McCulloch, W. and Pitts, W. (1943).
A logical calculus of the ideas immanent in nervous activity.
Bulletin of Mathematical Biophysics, 5:115-137.

McGovern et al., 1998
McGovern, A., Precup, D., Ravindran, B., Singh, S., and Sutton, R. (1998).
Hierarchical optimal control of MDPs.
Proceedings of the Tenth Yale Workshop on Adaptive and Learning Systems, pages 186-191.

Meyer and Wilson, 1991
Meyer, J. and Wilson, S., editors (1991).
From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior.

Michel, 1996
Michel, O. (1996).
Khepera simulator package version 2.0: Freeware mobile robot simulator.
http://wwwi3s.unice.fr/~om/khep-sim.html.

Mondada et al., 1994
Mondada, F., Franzi, E., and Ienne, P. (1994).
Mobile robot miniaturization: A tool for investigation in control algorithms.
In Yoshikawa, T. and Miyazaki, F., editors, Proceedings of the Third International Symposium on Experimental Robotics 1993, pages 501-513. Springer Verlag.

Moore, 1979
Moore, R. (1979).
Methods and Applications of Interval Analysis.
SIAM, Philadelphia.

Munos, 1997
Munos, R. (1997).
Apprentissage par Renforcement, Étude du cas Continu.
Doctoral thesis, EHESS, CEMAGREF.

Nowé, 1995
Nowé, A. (1995).
Fuzzy reinforcement learning: an overview.
Advances in fuzzy theory and technology.

O'Regan and Noë, 2001
O'Regan, J. and Noë, A. (2001).
A sensorimotor account of vision and visual consciousness.
Behavioral and Brain Sciences, 24(5).

O'Regan et al., 1999
O'Regan, J., Rensink, R., and Clark, J. (1999).
Blindness to scene changes caused by mudsplashes.
Nature, 398.

Pavlov, 1927
Pavlov, I. (1927).
Conditioned Reflexes.
Oxford University Press.

Pendrith, 1994
Pendrith, M. (1994).
On reinforcement learning of control actions in noisy and non-Markovian domains.
Technical report, UNSW Computer Science and Engineering.

Pendrith, 1999
Pendrith, M. (1999).
Reinforcement learning in situated agents: Some theoretical problems and practical solutions.
In 8th European Workshop on Learning Robots.

Pendrith and McGarity, 1998
Pendrith, M. and McGarity, M. (1998).
An analysis of direct reinforcement learning in non-Markovian domains.
The Fifteenth International Conference on Machine Learning.

Peng and Williams, 1996
Peng, J. and Williams, R. (1996).
Incremental multi-step Q-learning.
Machine Learning, 22:283-290.

Pham et al., 1992
Pham, D., Garrat, P., and Jutten, C. (1992).
Separation of a mixture of independent sources through a maximum likelihood approach.
In Proc. EUSIPCO, pages 771-774.

Pitrat, 1990
Pitrat, J. (1990).
Métaconnaissance - Futur de l'intelligence artificielle.
Hermès.

Pradel and Barret, 1998
Pradel, G. and Barret, C. (1998).
Environment recognition in mobile robotics by means of neural networks.
Journal Européen des Systèmes Automatisés, 32:939-963.

Quinlan, 1984
Quinlan, J. (1984).
Learning efficient classification procedures and their application to chess endgames.
Machine Learning: An Artificial Intelligence Approach, pages 463-482.

Rasmussen, 1986
Rasmussen, J. (1986).
Information Processing and Human-Machine Interaction; An approach to cognitive engineering.
Elsevier Science Publishing Co, North-Holland.

Reeke et al., 1990
Reeke, G., Sporns, O., and Edelman, G. (1990).
Synthetic neural modeling: The "Darwin" series of recognition automata.
Proc. of the IEEE, 78(9):1498-1530.

Reich, 1949
Reich, W. (1949).
L'analyse caractérielle.
Payot.

Revel, 1997
Revel, A. (1997).
Contrôle d'un robot mobile autonome par approche neuromimétique.
Doctoral thesis, Université de Cergy-Pontoise.

Rich, 1983
Rich, E. (1983).
Artificial Intelligence.
McGraw-Hill.

Rosenblatt, 1958
Rosenblatt, F. (1958).
The perceptron: A probabilistic model for information storage and organization in the brain.
Psychological Review, 65:386-408.

Rumelhart et al., 1986
Rumelhart, D., Hinton, G., and Williams, R. (1986).
Learning internal representations by error propagation.
Nature, 323:533-536.

Rummery, 1995
Rummery, G. (1995).
Problem Solving with Reinforcement Learning.
PhD thesis, Cambridge University.

Samuel, 1959
Samuel, A. (1959).
Some studies in machine learning using the game of checkers.
IBM Journal of Research and Development, 3:211-229.

Sauvage, 1999
Sauvage, G. (1999).
Les marchés financiers. Entre hasard et raison: le facteur humain.
Seuil.

Schwartz, 1993
Schwartz, A. (1993).
A reinforcement learning method for maximising undiscounted rewards.
Proceedings of Int. Conf. on Machine Learning.

Shannon, 1948
Shannon, C. (1948).
A mathematical theory of communication.
Bell System Technical Journal, 27:379-423, 623-656.

Shortliffe and Buchanan, 1975
Shortliffe, E. and Buchanan, B. (1975).
A model of inexact reasoning in medicine.
Mathematical Biosciences, 23:351-379.

Sutton, 1984
Sutton, R. (1984).
Temporal Credit Assignment in Reinforcement Learning.
PhD thesis, University of Massachusetts, Amherst, MA.

Sutton, 1988
Sutton, R. (1988).
Learning to predict by the method of temporal differences.
Machine Learning, 3:9-44.

Sutton, 1996
Sutton, R. (1996).
Generalization in reinforcement learning: Successful examples using sparse coarse coding.
Advances in Neural Information Processing Systems: Proceedings of the 1995 Conference, pages 1038-1044.

Sutton and Barto, 1998
Sutton, R. and Barto, A. (1998).
Reinforcement Learning: An Introduction.
MIT Press, Cambridge, MA.

Tanaka, 1995
Tanaka, K. (1995).
Stability and stabilizability of fuzzy-neural control systems.
IEEE Transactions on Fuzzy Systems, 3(4):438-447.

Tesauro, 1994
Tesauro, G. (1994).
TD-Gammon, a self-teaching backgammon program, achieves master-level play.
Neural Computation, 6:215-219.

Thagard and Barnes, 1996
Thagard, P. and Barnes, A. (1996).
Emotional decisions.
Proceedings of the Eighteenth Annual Conference of the Cognitive Science Society, pages 426-429.

Thagard and Millgram, 1997
Thagard, P. and Millgram, E. (1997).
Inference to the best plan: A coherence theory of decision.
Goal-Driven Learning, pages 439-454.

Touzet et al., 1995
Touzet, C., Sehad, S., and Giambiasi, N. (1995).
Improving reinforcement learning of obstacle avoidance behavior with forbidden sequences of actions.
In International Conference on Robotics and Manufacturing, Cancun, Mexico, 14-16 June 1995.

Tsitsiklis and Van Roy, 1996
Tsitsiklis, J. and Van Roy, B. (1996).
Feature-based methods for large scale dynamic programming.
Machine Learning, 22:59-94.

Tsitsiklis, 1994
Tsitsiklis, J. (1994).
Asynchronous stochastic approximation and Q-learning.
Machine Learning, 16:185-202.

Turing, 1936
Turing, A. (1936).
On computable numbers, with an application to the Entscheidungsproblem.
Proceedings of the London Mathematical Society, 42(2):230-265.

Turing, 1950
Turing, A. (1950).
Computing machinery and intelligence.
Mind, 59:433-460.

Tversky and Kahneman, 1981
Tversky, A. and Kahneman, D. (1981).
The framing of decisions and the psychology of choice.
Science, 211:453-458.

Wang et al., 1996
Wang, H., Tanaka, H., and Griffin, M. (1996).
An approach to fuzzy control of nonlinear systems: Stability and design issues.
IEEE Transactions on Fuzzy Systems, 4(1):14-23.

Watkins, 1989
Watkins, C. (1989).
Learning from Delayed Rewards.
PhD thesis, King's College, Cambridge, UK.

Weizenbaum, 1976
Weizenbaum, J. (1976).
Computer Power and Human Reason.
W.H. Freeman.

Whitehead, 1992
Whitehead, S. (1992).
Reinforcement Learning for the Adaptive Control of Perception and Action.
PhD thesis, University of Rochester, Rochester, NY.

Wiering, 1999
Wiering, M. (1999).
Explorations in Efficient Reinforcement Learning.
PhD thesis, Universiteit van Amsterdam.

Wiering and Schmidhuber, 1997
Wiering, M. and Schmidhuber, J. (1997).
HQ-learning.
Adaptive Behavior, 6(2):219-246.



2002-03-01